THE ANNALS 
of 

MATHEMATICAL 

STATISTICS 

(Founded by H. C. Carver)

The Official Journal of the Institute of 
Mathematical Statistics 


VOLUME XVI 


1945



THE ANNALS

OF MATHEMATICAL STATISTICS


EDITED BY
S. S. WILKS, Editor

C. C. CRAIG
ALLEN T. CRAIG
W. EDWARDS DEMING
W. FELLER
THORNTON C. FRY
HAROLD HOTELLING
J. NEYMAN
WALTER A. SHEWHART
A. WALD


WITH THE COOPERATION OF

WILLIAM G. COCHRAN
J. H. CURTISS
J. F. DALY
HAROLD F. DODGE
PAUL S. DWYER
CHURCHILL EISENHART
PAUL R. HALMOS
PAUL G. HOEL
WILLIAM G. MADOW
ALEXANDER M. MOOD
HENRY SCHEFFÉ
JACOB WOLFOWITZ


The Annals of Mathematical Statistics is published quarterly by the
Institute of Mathematical Statistics, Mt. Royal & Guilford Aves., Baltimore 2,
Md. Subscriptions, renewals, orders for back numbers and other business
communications should be sent to the Annals of Mathematical Statistics, Mt.
Royal & Guilford Aves., Baltimore 2, Md., or to the Secretary of the Institute
of Mathematical Statistics, P. S. Dwyer, 110 Rackham Hall, University of
Michigan, Ann Arbor, Mich.

Changes in mailing address which are to become effective for a given issue
should be reported to the Secretary on or before the 15th of the month preceding
the month of that issue. The months of issue are March, June, September
and December. Because of war-time difficulties of publication, issues may often
be from two to four weeks late in appearing. Subscribers are therefore requested
to wait at least 30 days after month of issue before making inquiries concerning
non-delivery.

Manuscripts for publication in the Annals of Mathematical Statistics
should be sent to S. S. Wilks, Fine Hall, Princeton, New Jersey. Manuscripts
should be typewritten double-spaced with wide margins, and the original copy
should be submitted. Footnotes should be reduced to a minimum and whenever
possible replaced by a bibliography at the end of the paper; formulae in footnotes
should be avoided. Figures, charts, and diagrams should be drawn on plain
white paper or tracing cloth in black India ink twice the size they are to be
printed. Authors are requested to keep in mind typographical difficulties of
complicated mathematical formulae.

Authors will ordinarily receive only galley proofs. Fifty reprints without
covers will be furnished free. Additional reprints and covers furnished at cost.

The subscription price for the Annals is $5.00 per year. Single copies $1.50.

Back numbers are available at $5.00 per volume, or $1.50 per single issue.

Composed and Printed at the
WAVERLY PRESS, Inc.

Baltimore, Md., U. S. A.



CONTENTS OF VOLUME XVI

Articles

Baer, Reinhold. Sampling from a Changing Population  348
Bronowski, J., and Neyman, J. The Variance of the Measure of a Two-dimensional Random Set  330
Brookner, Ralph J. Choice of One Among Several Statistical Hypotheses  221
Dwyer, Paul S., and Waugh, Frederick V. Compact Computation of the Inverse of a Matrix  259
Feller, W. On the Normal Approximation to the Binomial Distribution  319
Hoel, Paul G. Testing the Homogeneity of Poisson Frequencies  362
Hsu, L. C. Some Combinatorial Formulas on Mathematical Expectation  369
Hsu, P. L. The Approximate Distributions of the Mean and Variance of a Sample of Independent Variables  1
Hsu, P. L. On the Power Functions for the E²-Test and the T²-Test  278
Hsu, P. L. On the Approximate Distribution of Ratios  117
Kac, M. Random Walk in the Presence of Absorbing Barriers  62
Kaplansky, Irving. The Asymptotic Distribution of Runs of Consecutive Elements  200
Kaplansky, Irving, and Riordan, John. Multiple Matching and Runs by the Symbolic Method  272
Kishen, K. On the Design of Experiments for Weighing and Making Other Types of Measurements  294
Mann, Henry B. On a Problem of Estimation Occurring in Public Opinion Polls  85
Mann, Henry B. On a Test for Randomness Based on Signs of Differences  193
Mises, R. v. On the Classification of Observation Data into Distinct Groups  68
Neyman, J., and Bronowski, J. The Variance of the Measure of a Two-dimensional Random Set  330
Riordan, John, and Kaplansky, Irving. Multiple Matching and Runs by the Symbolic Method  272
Robbins, H. E. On the Measure of a Random Set. II  342
Rodrigues, Milton da Silva. On an Extension of the Concept of Moment with Applications to Measures of Variability, General Similarity, and Overlapping  74
Rubin, Herman. On the Distribution of the Serial Correlation Coefficient  211
Scheffé, H., and Tukey, J. W. Non-Parametric Estimation. I. Validation of Order Statistics  187
Stein, Charles. A Two-Sample Test for a Linear Hypothesis Whose Power Is Independent of the Variance  243
Stephan, Frederick F. The Expected Value and Variance of the Reciprocal and Other Negative Powers of a Positive Bernoullian Variate  50
Tukey, J. W., and Scheffé, H. Non-Parametric Estimation. I. Validation of Order Statistics  187
Vajda, S. On the Constituent Items of the Reduction and the Remainder in the Method of Least Squares  381



Wald, A. Sequential Tests of Statistical Hypotheses  117
Wald, Abraham. Some Generalizations of the Theory of Cumulative Sums of Random Variables  287
Wald, A., and Wolfowitz, J. Sampling Inspection Plans for Continuous Production Which Insure a Prescribed Limit on the Outgoing Quality
Waugh, Frederick V., and Dwyer, Paul S. Compact Computation of the Inverse of a Matrix  259
Wolfowitz, J., and Wald, A. Sampling Inspection Plans for Continuous Production Which Insure a Prescribed Limit on the Outgoing Quality




Notes

Bancroft, T. A. Note on an Identity in the Incomplete Beta Function
Berry, Clifford E. A Criterion of Convergence for the Classical Iterative Method of Solving Linear Simultaneous Equations
Chung, Kai-Lai, and Hsu, Lietz C. A Combinatorial Formula and its Application to the Theory of Probability of Arbitrary Events
Feller, W. Note on the Law of Large Numbers and "Fair" Games
Geiringer, Hilda. Further Remarks on Linkage Theory in Mendelian Heredity
Geiringer, Hilda. On the Definition of Distance in the Theory of the Gene
Harshbarger, Boyd. On the Analysis of a Certain Six-by-Six Four-group Lattice Design using the Recovery of Inter-block Information
Hsu, Lietz C., and Chung, Kai-Lai. A Combinatorial Formula and its Application to the Theory of Probability of Arbitrary Events
Kac, M. A Remark on Independence of Linear and Quadratic Forms Involving Independent Gaussian Variables
Kossack, Carl F. On the Mechanics of Classification
Madow, William G. Note on the Distribution of the Serial Correlation Coefficient
Mann, H. B. Note on a Paper by C. W. Cotterman and L. H. Snyder
Tintner, Gerhard. A Note on Rank, Multicollinearity and Multiple Regression
Waugh, F. V. A Note Concerning Hotelling's Method of Inverting a Partitioned Matrix

Miscellaneous

Abstracts
Annual Report of the President of the Institute
Annual Report of the Secretary-Treasurer of the Institute
Constitution and By-Laws of the Institute
Directory of Members of the Institute
News and Notices
Report of the Committee on Post-War Development of the Institute
Report of the Membership Committee of the Institute
Report on the Rutgers Meeting of the Institute





THE APPROXIMATE DISTRIBUTIONS OF THE MEAN AND 
VARIANCE OF A SAMPLE OF INDEPENDENT VARIABLES 

By P. L. Hsu

The National University of Peking

1. Introduction. In this paper we shall study the mean and variance of a
large number, n (a sample of size n), of mutually independent random variables:

(1) $\xi_1, \xi_2, \dots, \xi_n$,

having the same probability distribution represented by a (cumulative) distribution
function P(x). The rth moment, absolute moment, and semi-invariant of
P(x) are denoted by $\alpha_r$, $\beta_r$ and $\gamma_r$ respectively. It is assumed that for a certain
integer $k \ge 3$, $\beta_k < \infty$ and that $\alpha_2 > 0$. Hence there is no loss of generality in
assuming that

(2) $\alpha_1 = 0, \quad \alpha_2 = 1$.

The characteristic function corresponding to P(x) is denoted by p(t).

We put

(3) $\bar{\xi} = \frac{1}{n}\sum_{r=1}^{n} \xi_r, \qquad \eta = \frac{1}{n}\sum_{r=1}^{n} (\xi_r - \bar{\xi})^2,$

(4) $F(x) = \Pr\{\sqrt{n}\,\bar{\xi} \le x\}, \qquad G(x) = \Pr\left\{\frac{\sqrt{n}\,(\eta - 1)}{\sqrt{\alpha_4 - 1}} \le x\right\}.$

The definition of G(x) implies that $\alpha_4 < \infty$ and $\alpha_4 - 1 > 0$. The case $\alpha_4 - 1 = 0$
provides an easy degenerated case which will be treated separately (section 4).
Cramér's theorem of asymptotic expansion¹ reads as follows:

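Theorem 1. If P(x) is non-singular and if $\beta_k < \infty$ for some integer $k \ge 3$,
then

(5) $F(x) = \Phi(x) + \Psi(x) + R(x)$,

where

(6) $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy$,

$\Psi(x)$ is a certain linear combination of successive derivatives $\Phi^{(3)}(x), \dots, \Phi^{(3(k-3))}(x)$
with each coefficient of the form $n^{-\nu/2}$ times a quantity depending only on
$k, \alpha_3, \dots, \alpha_{k-1}$ $(1 \le \nu \le k - 3)$, and

(7) $|R(x)| \le \frac{Q}{n^{(k-2)/2}}$,

where Q is a constant depending only on k and F(x).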
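
As a rough numerical illustration of (5) with k = 4, the sketch below compares $\Phi(x)$ and $\Phi(x) + \Psi(x)$ with an F(x) that can be computed exactly. It assumes, as a choice made here and not taken from the paper, that the variables (1) are centred exponential variables (so that $\alpha_3 = 2$ and the sum follows a Gamma law), and it uses the usual single-term form $\Psi(x) = -\alpha_3 \Phi^{(3)}(x)/(6\sqrt{n})$ with $\Phi^{(3)}(x) = (x^2 - 1)\Phi'(x)$. All function names are illustrative.

```python
import math

def phi(x):
    """Standard normal density, the derivative of the function (6)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal distribution function, the function (6)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exact_cdf(x, n):
    """Exact F(x) = Pr{sqrt(n) * mean <= x} for xi = E - 1, E ~ Exp(1).

    The sum of n Exp(1) variables is Gamma(n, 1), whose distribution
    function at s > 0 is 1 - exp(-s) * sum_{j<n} s^j / j!.
    """
    s = n + x * math.sqrt(n)
    if s <= 0.0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, n):
        term *= s / j
        total += term
    return 1.0 - math.exp(-s) * total

def edgeworth_k4(x, n, alpha3):
    """Phi(x) + Psi(x), keeping only the single term present when k = 4."""
    return Phi(x) - alpha3 / (6.0 * math.sqrt(n)) * (x * x - 1.0) * phi(x)

def max_errors(n, alpha3=2.0):
    """Largest error on a grid of x for Phi alone and for Phi + Psi."""
    grid = [i / 50.0 for i in range(-200, 201)]
    err_normal = max(abs(exact_cdf(x, n) - Phi(x)) for x in grid)
    err_expanded = max(abs(exact_cdf(x, n) - edgeworth_k4(x, n, alpha3)) for x in grid)
    return err_normal, err_expanded
```

Calling `max_errors(20)` and `max_errors(80)` gives a feeling for how much the correction term helps at moderate n; the grid and the sample sizes are arbitrary choices.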

¹ H. Cramér: Random Variables and Probability Distributions (1937), Ch. 7. This book
will be referred to as (C).




In particular, putting k = 3 we get that $|F(x) - \Phi(x)| \le Q/\sqrt{n}$, provided that
P(x) is non-singular and $\beta_3 < \infty$. If the condition of non-singularity of
P(x) be removed, then Liapounoff's theorem² furnishes the weaker result:
$|F(x) - \Phi(x)| \le A\beta_3 n^{-1/2} \log n$, where A is a numerical constant.

Very recently Berry³ succeeded in removing the factor log n in Liapounoff's
theorem under no other condition than that $\beta_3 < \infty$. We have Berry's
theorem:

Theorem 2. If $\beta_3 < \infty$, then

(8) $|F(x) - \Phi(x)| \le \frac{A\beta_3}{\sqrt{n}}$,

where A is a numerical constant.

An essential step in the proof of these results is the selection of a weighting
function w(u) and the appraisal of the integral

(9) $\int_{-\infty}^{\infty} w(u)\{F(u + x) - \Phi(u + x) - \Psi(u + x)\}\,du$

($\Psi = 0$ when k = 3). In his book² Cramér proves Theorem 1 by taking

(10) $w(u) = \frac{1}{\Gamma(\omega)}(-u)^{\omega - 1}$ when $u < 0$ and $w(u) = 0$ when $u \ge 0$ $(0 < \omega < 1)$

and proves Liapounoff's theorem by taking 



On the other hand, Berry uses the following weighting function in his proof of
Theorem 2:

(12) $w(u) = \frac{1 - \cos Tu}{u^2}$.

The unfortunate selection of the function (11) accounts for the presence of the
factor log n in Liapounoff's theorem.

Now Cramér's proof of Theorem 1, based on the integral (9) with w(u) defined
in (10), makes use of a result on that integral due to M. Riesz. A more elementary
proof than this can be devised. In fact, one has only to use, with
Berry, the function (12) and to adopt his elementary appraisal⁴ of the integral

² (C), Ch. 7.

³ A. C. Berry: "The accuracy of the Gaussian approximation to the sum of independent
variates." Trans. Amer. Math. Soc., Vol. 49 (1941), pp. 122-136. This paper will be referred
to as (B).

⁴ Berry proves the inequality (in our notation):




(9) in order to obtain the proof of Theorem 1. One of our purposes is therefore
to give an elementary proof of Theorem 1, without reference to the above-mentioned
result due to M. Riesz. Section 2 is devoted to this work.

We ought to add that Cramér's theorem and Berry's theorem correspond to
Theorems 1 and 2 for the case in which the random variables (1) do not follow
the same distribution. The proof given in Section 2 is adaptable to these more
general theorems when subjected to appropriate modifications; the assumption
of a common distribution function for (1) is only made for the sake of convenience.

So much for the known results for the approximate distribution of $\bar{\xi}$. By a
purely formal operational method Cornish and Fisher⁵ obtain terms of successive
approximation to the distribution function of any random variable X with the
help of its semi-invariants. It is hardly necessary to emphasize the importance
of turning Cornish and Fisher's formal result (asymptotic expansion without
appraisal of the remainder) into a mathematical theorem of asymptotic expansion
which gives the order of magnitude of the remainder. In this paper we
achieve this for the simplest function of (1) next to $\bar{\xi}$, viz. the $\eta$ in (3). We do
not seek to remove the assumption of a common distribution for (1), as there
will be no practical significance (e.g. in statistics) of $\eta$ if the variables (1) do not
have the same probability distribution. Section 3 is devoted to the proof of
the following theorems:

Theorem 3. If $\alpha_6 < \infty$ and $\alpha_4 - 1 - \alpha_3^2 \ne 0$ (it cannot be negative), then

03) 

where A is a numerical constant. 

Theorem 4. Let P(x) be non-singular and let $\alpha_{2k} < \infty$ for some integer $k \ge 3$.
Then

(14) $G(x) = \Phi(x) + \chi(x) + R_1(x)$,

where $\Phi(x)$ is the function (6), $\chi(x)$ is a linear combination of successive derivatives
of $\Phi(x)$, with each coefficient of the form $n^{-\nu/2}$ times a quantity depending only
on k and $\alpha_3, \alpha_4, \dots, \alpha_{2k-1}$, and


(B), p. 128. The "appraisal" mentioned here refers to (50) which is contained in (B), p. 128.
But Berry's appraisal of the integral in the right-hand side of the above inequality is in
default. He writes


5 i"' (t - - T 4/5-5- 5 £“ 


(B, p. 132, line 3) whilst the last integral ought to be 



— e)t* 4- c — 


dt. 


⁵ E. A. Cornish and R. A. Fisher: "Moments and cumulants in the specification of distributions."
(Revue de l'Institut International de Statistique (1937), pp. 1-14.)






(15) I Ri{x) i < if h ^ -U 5 or f» 

(W) |S‘WlS;.„4*r„4.. if* S’ 

where $Q_k$ and $Q_k'$ are constants depending only on k and F(x).

It may be noticed that Theorem 3 is a "Berryan" theorem about G(x), its
characteristic feature being the absence of any condition on the distribution
function except the two on its moments, and that Theorem 4 is a "Cramérian"
theorem about G(x), the characteristic feature being the assumption of non-singularity
of P(x) besides that $\alpha_{2k} < \infty$.

In proving these theorems we have devised a method which is applicable to
getting similar results about functions other than $\eta$, such as functions commonly
used in applied statistics: the higher moments about the means, the
moment ratios (e.g. K. Pearson's $b_1$ and $b_2$), the covariance, the coefficient of
correlation, and "Student's" t-statistic. Work on such functions is being
done by my university colleagues, and the results will be published shortly.

If $\xi$ is any of the random variables (1), then

$0 \le \mathcal{E}\{a(\xi^2 - 1) + b\xi\}^2 = a^2(\alpha_4 - 1) + 2ab\alpha_3 + b^2$

for all real (a, b). Hence $\alpha_4 - 1 - \alpha_3^2 \ge 0$, and $\alpha_4 - 1 - \alpha_3^2 = 0$ means that
there is unit probability that $\xi$ assumes exactly two values. This easy degenerated
case is first eliminated in Theorem 3 by the assumption $\alpha_4 - 1 - \alpha_3^2 \ne 0$
and then considered in section 4. In Theorem 4 the condition $\alpha_4 - 1 - \alpha_3^2 \ne 0$
is implied since $\xi$ cannot be a random variable of the nature just described owing
to the non-singularity of P(x).
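
The inequality $\alpha_4 - 1 - \alpha_3^2 \ge 0$, with equality for two-valued variables, can be checked on small discrete examples. The sketch below is only an illustration: the helper names and the test distributions are assumptions made here, and each distribution is first reduced to mean 0 and variance 1 as in (2).

```python
def standardized_moments(values, probs):
    """Return (alpha_3, alpha_4) after reduction to mean 0 and variance 1."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    sd = var ** 0.5
    z = [(v - mean) / sd for v in values]
    a3 = sum(p * t ** 3 for t, p in zip(z, probs))
    a4 = sum(p * t ** 4 for t, p in zip(z, probs))
    return a3, a4

def gap(values, probs):
    """alpha_4 - 1 - alpha_3^2, which should never be negative."""
    a3, a4 = standardized_moments(values, probs)
    return a4 - 1.0 - a3 * a3
```

For instance, `gap([0, 1], [0.3, 0.7])` is zero up to rounding (a two-valued variable), while `gap([-1, 0, 2], [0.2, 0.5, 0.3])` is strictly positive.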

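2. Lemmas. Throughout this paper A, B, C, etc. will denote positive numerical
constants; $A_k$, $B_k$, etc. ($A_{km}$, $B_{km}$, etc.) will denote positive constants depending
only on some integer k (integers k and m), and $Q_k$ ($Q_{km}$) will denote a positive
constant depending only on k (k and m) and the distribution function F(x).
$\vartheta$, $\vartheta_k$ ($\vartheta_{km}$), $\Theta_k$ ($\Theta_{km}$) will denote respectively quantities such that $|\vartheta| \le 1$,
$|\vartheta_k| \le A_k$ ($|\vartheta_{km}| \le A_{km}$), $|\Theta_k| \le Q_k$ ($|\Theta_{km}| \le Q_{km}$). These
symbols do not necessarily denote the same quantity at each occurrence.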
nus Zt? - 0, fcefc = Gfc etc. In particular any positive functions of A*, or,, • • - , or* 


nffsyraptotic expansion of the diaracteristie function 

In vtSu t when (1) do not have the same distribu- 

) <1 or 1 < I < Qi,n , Since we assume a common distribution for (1), 

. I / 4 W n v't 


so that the characteristic function 




we are able to derive an 


V wn/j " 

asymptotic expansion valid for | i 1 < Q,Vn. The extension to 






presents no difficulty. 


This is done in the following three lemmas, 


of which Lemma 3 contains the final result. 
Lemma 1. 


(17) 


log p(0 = g + Qtfik I < I*, for I < I < 


Proof; Since p(t) = 1 + 2 “- w - + — t t-- = 1 + q(t) say, we have, for 

f"*! n fC! 


< 1, 


5 ( 1 ) < fi -»- 2 < ?. 

r-S r\ r-2 rl r-2 Tl 4 


Hence 

(18) log p(f) = 


2 (-i)i+>kMVe|g(0 


For $1 \le j \le [\tfrac12(k - 1)]$ let us expand each $(-1)^{j+1} j^{-1}\{q(t)\}^j$ to get a polynomial
$q_j(t)$ of degree k - 1 and a remainder $r_j(t)$. In doing this we regard q(t) formally
as a polynomial of degree k in t. For this polynomial we have the majorating
relation

qit) « 

whence 




which gives 

(19) I ry(<) I < 2/^^^^ < //3*1 1 1 1 1* < ^*^,1 1 1*. 


Similarly, 

(20) I qit) I <A^,\t\^ 

From (18), (19), (20) we obtain 

(21) log pit) = 2 qjit) + 9t|3)fc| 1 1\ 

ifiysiTcfc-Dl 

Since the sum in (21) must equal the sum in (17), the Lemma is proved. 
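
The content of Lemma 1 is that log p(t) agrees with the finite sum of the terms $\gamma_r (it)^r / r!$, $2 \le r \le k - 1$, up to a remainder of order $|t|^k$. A minimal sketch of this behaviour, assuming (as an illustration only) the centred exponential distribution, for which $p(t) = e^{-it}/(1 - it)$ and $\gamma_r = (r - 1)!$; the function names are hypothetical.

```python
import cmath
import math

def log_cf_centered_exponential(t):
    """log p(t) for xi = E - 1, E ~ Exp(1), where p(t) = exp(-it) / (1 - it)."""
    return -1j * t - cmath.log(1.0 - 1j * t)

def cumulant_series(t, k):
    """Sum of gamma_r (it)^r / r! for r = 2, ..., k - 1, with gamma_r = (r - 1)!."""
    return sum(math.factorial(r - 1) * (1j * t) ** r / math.factorial(r)
               for r in range(2, k))

def scaled_remainder(t, k):
    """|log p(t) - series| / |t|^k, which should stay bounded for small |t|."""
    return abs(log_cf_centered_exponential(t) - cumulant_series(t, k)) / abs(t) ** k
```

For this particular distribution the remainder is a geometric-type tail, so `scaled_remainder(t, k)` cannot exceed 1 / (k (1 - |t|)) when |t| < 1; other distributions would need their own constants.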

Lemma 2. Let $(\xi_1, \xi_2, \dots, \xi_m)$ be a random point with $\mathcal{E}(\xi_i) = 0$ and
$\mathcal{E}(|\xi_i|^k) = \beta_{ki} < \infty$ for some integer $k \ge 3$ (i = 1, …, m). Let $p(t_1, \dots, t_m)$
be the characteriatic function. Then for | | < (f = 1, • • • , m) 

noe have 





where $U_r$ and $V_r$ are the rth semi-invariant and the rth absolute moment respectively of
$\sum_{i=1}^{m} t_i \xi_i$.

^Sooprlf |i.-| < then Vj'* < “‘-‘(I'A*!«. iV * < 

jyj£fc-n/)!(' 2 jgj‘(* I i,|) < ■y/n. Since p value at I » of 

the characteristic function of Shf,-, it follows from I^mma 1 that for y/n > 
Vi'^ we have (22). 

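Lemma 3. Let $(\xi_1, \dots, \xi_m)$ be a random point with $\mathcal{E}(\xi_i) = 0$, $\mathcal{E}(\xi_i^2) = 1$ and
$\mathcal{E}(|\xi_i|^k) = \beta_{ki} < \infty$ for some integer $k \ge 3$. Let $\rho_{ij} = \mathcal{E}(\xi_i \xi_j)$ (i, j = 1,
…, m) and the matrix $\|\rho_{ij}\|$ be positive definite. Let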


I r ^ 

(23) A = det.l Pi/1, , O ® 

Let $p(t_1, \dots, t_m)$ be the characteristic function. Then there exists a $B_{km}$ such that

for 1 L 1 < (i = 1, •' • , m) tne have 

^ki 

f (^' ■■■•‘Oil +#Wi. 

(24) 

+ii,r + ■•• 

where $\psi(it_1, \dots, it_m)$ is a polynomial each of whose terms has the form

(ilm)'", 


with $1 \le \nu \le k - 3$, $3 \le \nu_1 + \cdots + \nu_m \le 3(k - 3)$, and depending only
on k and the moments $\mathcal{E}(\xi_1^{\mu_1} \cdots \xi_m^{\mu_m})$, $3 \le \mu_1 + \cdots + \mu_m \le k - 1$. If k = 3,
then $\psi = 0$.

Proof. If j t.-1 < V^, then j L | < Vn aince- 

A < 1 and ^*i > 1. It follows from Lemma. 2 and the fact Ut » ^pitUi that 





(26) 

WVn’- 

v^)/ “ 



w tp{k, • . 

where 




fc-8 j 



( 26 ) 




i^Ur^ 0*y* 
Vn n:o (r + 3) 1 n'‘^ ' 





Regarding s formally as a polynomial in n~* let us expand each (1 < 

j < fc — 3) to get a polynomial s, of degree — 3 in n~^ and a remainder r,. 
For the formal polynomial s v e have the majorating relation 

® v; hr’n- « v;S Vi ‘ ■ 

whence 


li s’ « Ai 


■fr*ilk ..t , 

►'* n-1 




j/1 


which gives 


r;| < 


AkW"’ ^ /Fi'* 


^*_LL 

nil- 


- < 




Since < 1 as shown in the proof of Lemma 2, we have 

— 'ni(i‘-v — nfoM) 

Since 0k, > 1 we have Hence 


kil < 


(28) 

Similarly 

(A: - 2)1 

From (25), (28), (29) we get 


ni(k-v 


< — 


7|Rfc-s) 


— »>(<1 , • • - jim) (1 4- 4^iih , • ■ - , itm )} 


k-i 


2 )! 


J^l 


+ I2^!!‘‘”'*(I 1.1* +11.1*" + • • • +1 (< 1'“-”) 1 Ki., • ■ •, I.).'*' 

where $\psi(it_1, \dots, it_m)$ stands for $\sum s_j$. The assertion about $\psi(it_1, \dots, it_m)$
announced in the lemma can now be seen without difficulty. It remains to show
that with suitable $B_{km}$ in the lemma, we have


viti, 4,)e*'' < e 




rf-l 





i.e. 

(30) „ ^ 

JS ti/~l 

From (27) we have 


s £ PitUij + laj ~ S • 


■Im*” 


(31) 




’k\> k 




If we choose < (47^"'/li„)' ‘ (and in. order that the eftrlier 

results may not be ofiected), the d*#. here coinciding with the last written 
in (31), we have, for ! h ] < n, 

02) >ls—ifi:. 


On the other hand, if $\lambda_1, \lambda_2, \dots, \lambda_m$ are the latent roots of $\|\rho_{ij}\|$, then each
$\lambda_i < m$ since their sum is m. Letting $\lambda_1$ be the smallest one we have

(SS) E z a Z ■ 

(32) and (33) imply (30). Hence the lemma is proved.

Let us write down the particular cases m = 1 and m = 2 of (24):


(34) 




+ 


Qk 








-4*\/A 


h Ji.M" -tci5+‘J+*'‘»‘«> 

Vn’ ■y/n/j 


« e 


{1 + ^(ih , iti) 1 


(“) + {§ i‘+11. r‘ + ■ • ■ +1 (, c'"' 






A*(l — p^)-\/n 

;S7S I 
Pkx 


Ahti) 


)■ 


More specially let us rewrite (34) and (35) with k = 3:


(36) 


(37) 





f / h h XV -l(*f+<J+spii<i) 

+ I <. r + 11., (ms ^(1 - »Wn ^ , 





In this paper only these last four formulae are needed; they are used in the
proofs of Theorems 2, 1, 3, 4 respectively. Cases of m > 2 of (24) will be
needed for the works on other functions alluded to in the introduction.

1.2. In the following group of lemmas, which culminate in Lemma 7, one
finds a generalization of the Riemann-Lebesgue theorem, viz. Lemma 6.

Lemma 4. Let f(x) be a polynomial of degree m > 0, with real coefficients:

(38) $f(x) = \sum_{i=0}^{m} a_i x^{m-i} \qquad (a_0 \ne 0)$.

Then 

Proof: It is sufficient to prove the inequality for $\int_0^1 \cos f(x)\,dx$. Divide
the interval into $A_m$ sub-intervals in each of whose interior none of the derivatives
$f^{(i)}(x)$ (i = 1, …, m) vanishes. It is sufficient to consider one of these
sub-intervals, say (a, b). Consequently each of the polynomials $f^{(i)}(x)$ is
monotonic in (a, b). Let

(39) $J = \int_a^b \cos f(x)\,dx$.

Suppose first that $f'(x)$ is positive and increasing for $a \le x \le b$. Then
I j I < e 4- I r /'(=g) cQa/(x) dx I 

“ I fix;) I 

‘ + f '( a+tj I‘ 

by the second mean-value theorem. Hence 


(40) 


l-ri < ‘ + 


2 

f'(a -h e) 


NowO </'(a -t- ^t) = /'(a -t- *) — t/"(a -f &e)/2, ^ ^ < 1. Hence f(a -f- 

«) > + ®«)- Since/"(x) is monotonic, we have either f(a + t) > j[<f" 

(a -1- <) orf(a + <) > + h)- In other words, there exists a constant C », 

independent of a or «, such that i < Cj < 1 and/'(o + e) > + Ctt), 

lif"ix) > 0, we have, as before/"(o -f Cj«) > -i- Cs*), where Cj 

is independent of a or e and i < (7» < 1. If fix) < 0, then, since 
0 < ria 2(7,*) = /"(a -H (7,*) -f Ctf'ia -h e.Ctt), * < < 1, wo have 

f'ia + Cif) > —Cttf'ia -H 201(7,*). Aa fix) is monotonic, either/"(o -h 
(7,*) > -C,tfia + C7„) orr(a + (7,*) > -(7,*/'"(a -H 2C,e). In all cases 
we obtain/"(a + Ctt) > R,* |/"'(a + (7,«) (, where 5, and (7, are independent 
of a or *, and i < Cs < 2. Hence fia -+-*)> ] fia + Ca) [. Argumg 

with d=f'"ia -t- Cst) as we did 'srith/"(a (7,«), and so on until we come to/*"’^ 





' / * UJ t 


ft. 


bI 

that ('r,t < " «• J»t»'1*IB'l!'(V }- !riir, 5hrn / ; < 

,1 I," il(*nr<‘ thf k'rnma i-i lni»* Uit fU' gii*-! uirrpiM?- 


j] -‘5«UllIlf 

llif’ fwici 


jnI tf 


wQ obtain /'(« + «^ > 
in (40) and puttinR e - 

presuppoaoH t, 
b ~ a < Cm 1 *■ 

ing in (o, W- 

If $f'(x)$ is positive and decreasing in (a, b), we write $J = \int_0^{b-a} \cos\{-f(b - y)\}\,dy$,
$-f(b - y)$ being a polynomial with the leading coefficient $\pm a_0$ and the first
derivative $f'(b - y)$, which is positive and increasing. This case reduces therefore
to the preceding one. Finally, if $f'(x)$ be negative, we have only to notice

that $J = \int_a^b \cos(-f(x))\,dx$. Hence the lemma is proved.
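
The displayed inequality of Lemma 4 is not legible in this copy. Assuming it is a bound of the form $A_m / |a_0|^{1/m}$ for $\left|\int_0^1 e^{if(x)}\,dx\right|$, which is what the choice of $\epsilon$ in the proof suggests, the decay can be observed numerically. The sketch below uses composite Simpson quadrature; the function name, the step count and the test polynomial $f(x) = a_0 x^2$ are illustrative assumptions.

```python
import cmath
import math

def oscillatory_integral(coeffs, steps=20_000):
    """Composite Simpson approximation of the integral of exp(i f(x)) over (0, 1).

    coeffs = [a_0, a_1, ..., a_m] are the coefficients of
    f(x) = a_0 x^m + a_1 x^(m-1) + ... + a_m, ordered as in (38).
    """
    def f(x):
        value = 0.0
        for a in coeffs:              # Horner's rule
            value = value * x + a
        return value

    h = 1.0 / steps                   # steps must be even for Simpson's rule
    total = cmath.exp(1j * f(0.0)) + cmath.exp(1j * f(1.0))
    for j in range(1, steps):
        weight = 4.0 if j % 2 else 2.0
        total += weight * cmath.exp(1j * f(j * h))
    return total * h / 3.0
```

With m = 2 one may compare `abs(oscillatory_integral([a0, 0.0, 0.0])) * math.sqrt(a0)` for increasing a0; under the assumed form of the lemma this product should remain bounded. The step count must be raised if much larger coefficients are tried.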
Lemma. 5.** Letf{x) be the polynomial f.lHat, atirf Iff «, ^ O/kt r, (|r m. 
Then 


(41) 


,'1 
; <*9 


(It < 


dm 


Proof; We may assume that ) m! > I. C4Fi beinR trivial if n, * 1, if 
r = 0 this reduces to I/'nima 4, SupiMWM^ that the lemnm i*. inm for itfl, ni, 
... I,et/i(x) “ OfrT" + ••• + 0, ix" /j'xf bx' " /prl and 

divide (0 1) into Am 8ul>-intervalK in. each of \vhi<’h J^(x^ i» mnnutomc. It « 
sufficient to consider one of these Kulvinten-alK. say. fa. In. Wi> hnvr 

Z = f cos {fiix) +/a(J')} dx 

*'a 

sin /)(x| sin /j(x) (k. 


j cos/i(x) cas /)(t] (ix 


We have only to consider the integral of cosines, say J. Divide (a, b) into sub-intervals
in each of whose interior $\cos f_1(x)$ is monotonic and does not vanish.
The number of such intervals does not exceed (iir)'' /ifhl /ifal J < 
(^’r)“\l/i(b) I + |/i(fl)l) < 2(1 ao 1 4- • • • + 'a. I :b Then, by the weottd 

mean-value theorem, 

1/1 < 2(lool+•••+'Or-jl)* / COS/s{i) f/x' (o < hi h). 

I *'4 


Hence, applying Lemma 4 to Mx), we get 

/Ao\ 1 tI ^ Am(kl + ■- + |ar-i| ) ^ xUdoti. -f ••■ + !«> >;) 


Of 


On the hypothesis of induction ^\e have 1 Z j < A^ j ct, j’ *” (j » 0, ... , r - 1), 
If 1 a. 1 > I Or 1''’“'" for some i < r, then in < A„ j a„ i if I o, I < 
1 a, 1'^*"', then by (42), 1Z | < Am | | The proof is therefore complete. 





Lemma 6. Let f(x) be the polynomial (38) and g(x) be summable over $(-\infty, \infty)$.
Then for every r we have

(43) $\lim_{|a_r| \to \infty} \int_{-\infty}^{\infty} e^{if(x)} g(x)\,dx = 0$, uniformly in $a_i$ $(i \ne r)$.

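Proof: By Lemma 5 we have

$\lim_{|a_r| \to \infty} \int_0^1 e^{if(x)}\,dx = 0$, uniformly in $a_i$ $(i \ne r)$.

Hence

(44) $\lim_{|a_r| \to \infty} \int_a^b e^{if(x)}\,dx = 0$, uniformly in $a_i$ $(i \ne r)$,

for if $a \ne 0$ and $b \ne 0$, then (a, b) is the sum or the difference of two intervals of
the form (0, c) or (c, 0), and for the latter intervals the transformation $x = \pm cy$
reduces the interval of integration to (0, 1).

Let G be any open set of finite measure. Then G is the sum of a sequence
$(I_\nu)$ of non-overlapping intervals. Since $\sum mI_\nu = mG < \infty$, we have

$\sum_{\nu > n} mI_\nu < \epsilon, \qquad n > N$.

Hence

$\left|\int_G e^{if(x)}\,dx\right| \le \sum_{\nu=1}^{n} \left|\int_{I_\nu} e^{if(x)}\,dx\right| + \epsilon$,

which, together with (44), implies

(45) $\lim_{|a_r| \to \infty} \int_G e^{if(x)}\,dx = 0$ uniformly in $a_i$ $(i \ne r)$.

Let S be any set of finite measure. Then there is an open set G such that $G \supset S$
and $m(G - S) < \epsilon$. Hence

$\left|\int_S e^{if(x)}\,dx\right| \le \epsilon + \left|\int_G e^{if(x)}\,dx\right|$.

Hence, by (45),

(46) $\lim_{|a_r| \to \infty} \int_S e^{if(x)}\,dx = 0$ uniformly in $a_i$ $(i \ne r)$.

Now let h(x) be any positive "simple" summable function, i.e. $h(x) = c_\nu > 0$
for $x \in S_\nu$ ($\nu$ = 1, 2, …, n) and h(x) = 0 otherwise. Since h(x) is summable,
each $S_\nu$ must be of finite measure. Hence

$\left|\int_{-\infty}^{\infty} e^{if(x)} h(x)\,dx\right| \le \sum_{\nu=1}^{n} c_\nu \left|\int_{S_\nu} e^{if(x)}\,dx\right|$,

which, together with (46), implies

$\lim_{|a_r| \to \infty} \int_{-\infty}^{\infty} e^{if(x)} h(x)\,dx = 0$ uniformly in $a_i$ $(i \ne r)$.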





Finally, let g(x) be any summable function $\ge 0$. Then by a well-known theorem⁶
we have $g(x) = \lim_{n \to \infty} h_n(x)$, where $h_n(x)$ is an ascending sequence of
summable simple functions. Hence

$\left|\int_{-\infty}^{\infty} e^{if(x)} g(x)\,dx\right| \le \left|\int_{-\infty}^{\infty} e^{if(x)} h_n(x)\,dx\right| + \int_{-\infty}^{\infty} \{g(x) - h_n(x)\}\,dx$.

By monotonic convergence the last integral tends to 0 as $n \to \infty$. Hence

$\left|\int_{-\infty}^{\infty} e^{if(x)} g(x)\,dx\right| \le \epsilon + \left|\int_{-\infty}^{\infty} e^{if(x)} h_N(x)\,dx\right|$,

which implies (43). If g(x) is any summable function, we have only to consider
the customary expression of g(x) as the difference of two non-negative functions.
This completes the proof.

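Lemma 7. Let P(x) be a non-singular distribution function of a random variable
X, and let

(47) $p(t_1, \dots, t_r) = \int_{-\infty}^{\infty} e^{i \sum_{\nu=1}^{r} t_\nu x^\nu}\,dP$.

Then for every r and every positive constant c we have

(48) $\underset{|t_r| \ge c}{\mathrm{l.u.b.}}\ |p(t_1, \dots, t_r)| < 1$.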
Mfli* 

Proof: We have $P(x) = a_1 P_1(x) + a_2 P_2(x)$, where $P_1(x)$ is absolutely continuous,
$P_2$ is singular, $a_1 > 0$, $a_1 + a_2 = 1$. Hence

$|p(t_1, t_2, \dots, t_r)| \le a_1 \left|\int_{-\infty}^{\infty} e^{i \sum t_\nu x^\nu} P_1'(x)\,dx\right| + a_2$.

By Lemma 6 we may find C > 0 such that

$|p(t_1, t_2, \dots, t_r)| \le \tfrac12 a_1 + a_2 < 1$, if any $|t_i| > C$.

Suppose that

$\underset{|t_r| \ge c}{\mathrm{l.u.b.}}\ |p(t_1, \dots, t_r)| = 1$,

then c < C and we must have

(49) $\underset{c \le |t_r| \le C,\ |t_i| \le C\ (i < r)}{\mathrm{l.u.b.}}\ |p(t_1, \dots, t_r)| = 1$.

Since $|p(t_1, \dots, t_r)|$ is a continuous function, it must attain its least upper bound
in any bounded closed set. It follows that there is a point $(t_1', \dots, t_r')$ such
that $t_r' \ne 0$ $(|t_r'| \ge c)$ and $|p(t_1', \dots, t_r')| = 1$. But this implies that the
distribution of $\sum t_\nu' X^\nu$ is discrete, i.e. that the distribution of X itself is discrete,


⁶ H. Kestelman: Modern Theories of Integration (1937), p. 108.
⁷ Cf. (C), p. 26.



which contradicts the non-singularity of P(x). Hence (48) is true.


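1.3. In his cited work Berry⁸ shows that if F(x) is a distribution function
and if $\Phi(x)$ is the function (6), then there is a constant a such that

(50) $\left|\int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2}\{F(x + a) - \Phi(x + a)\}\,dx\right| \ge \sqrt{\frac{2}{\pi}}\,T\delta\left\{3\int_0^{T\delta} \frac{1 - \cos x}{x^2}\,dx - \pi\right\}$,

where $\delta = \sqrt{\pi/2}\ \mathrm{l.u.b.}\,|F(x) - \Phi(x)|$. This is easily extended to the following
lemma, which needs no further proof.

Lemma 8. Let F(x) be a distribution function and $F_1(x)$ a function having
the following properties: (i) $F_1(x)$ is bounded for all x, (ii) $F_1(x) \to 1$ as $x \to \infty$,
$F_1(x) \to 0$ as $x \to -\infty$, (iii) $F_1(x)$ has a bounded derivative, $|F_1'(x)| \le M$. Let

$\delta = \frac{1}{2M}\ \mathrm{l.u.b.}\,|F(x) - F_1(x)|$.

Then there exists a constant a such that

(51) $\left|\int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2}\{F(x + a) - F_1(x + a)\}\,dx\right| \ge 2MT\delta\left\{3\int_0^{T\delta} \frac{1 - \cos x}{x^2}\,dx - \pi\right\}$.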

1.4. In section 3 we define, for given n, k, $\lambda$ and z, a function
(62) Oix, v) = e-'"'’' if z < x < z -f 'Ky\ Oix, v) ^ ^ 

The introduction of G(x, y) and the appraisal of its Fourier expansion are
the essence of our method of solving the problem of the asymptotic expansion
of the distribution function G(x). The solution of the same problem about
other functions of (1) alluded to in section 3 is based on the use of other
functions playing the role of G(x, y). We now prove the following lemma.

Lemma 9. Let G(x, y) be defined by (52) and let


(63) 

ffCti, ti) = 

Then 


(i) 


(ii) 

VI 

(iii) 

\giti, fi)l < j-^j( 


(x+^;!^+^’) if.-3, 


⁸ (B), p. 128.





Proof; 

(i) I g{h ,h)\< fr(x, y) dr dy |/*e '*”■ du = 

(ii) Putting ^: = 3 we liave 

, to “ ~ £ e"-‘■''’■'O “ « 

l«(l..tf|<|-^|£_U(v)B"'(l,)rf|,l. 

where u{y) = e""'*(l - e“’''^»'), v(,y) « On integrating liy part* we 

obtain 

(64) 1 g{k , t») 1 < I viy)u'*'{y) dy | < ! u"'(y) 1 dy. 

Elementary calculation eatablishea that 

< e"“'‘(216\e’lj/r + 75f)Xe*tl/r 

In I 

+ zzm U r + 8\*! h r i y i’ + 12\’ 11, i i y i). 

Substituting in (54) and making the tranafownationj/ » (“*^*xwe get the result, 
(iii) We have 

1 9{k . <») 1 < ,-^1 1 £ dy |. 

Integrating by parts twice we obtain 




dy. 


By elementary calculations we get 


1 gik , fe) 1 < £ (4^’Xey‘* + 2fe(fc + 3)X«y“ + 4X‘ | /i| y* + 2X)e~"f'‘ dy 

which, on the transformation y = gives the result.^ 

1.5. We prove a few additional lemmas used in the proof of Theorems 3 and 4.

Lemma⁹ 10. Let $u(x_1, \dots, x_m) \ge 0$ be summable in the m-dimensional space
and let

(55) $v(t_1, \dots, t_m) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{i(t_1 x_1 + \cdots + t_m x_m)}\, u(x_1, \dots, x_m)\,dx_1 \cdots dx_m$.


⁹ Although the author believes that this lemma is almost classical, a proof is given owing
to lack of reference.





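If $v(t_1, \dots, t_m)$ is summable in the m-dimensional space, then

(56) $u(x_1, \dots, x_m) = \frac{1}{(2\pi)^m} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-i(t_1 x_1 + \cdots + t_m x_m)}\, v(t_1, \dots, t_m)\,dt_1 \cdots dt_m$.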

Proof: Except for a constant factor the function $u(x_1, \dots, x_m)$ may be
regarded as a probability density function. Hence by the well-known inversion
formula of (55),


(57) 




*1 ^ni) dxi * *' dXif, 


a,^x,£b, (i“li "im) 


1 

(27r)- 



e(b , (hi • • • JU . 


Now $u(x_1, \dots, x_m)$ is almost everywhere the symmetric derivative of the interval
function in the left-hand side of (57):


’^( 2 : 1 , •••, = lim—Jy- /■•■/ «( 2 /i, • • •, yJ dy, • ■ • . 

Hence 



Owing to dominated convergence the order of the limit sign and the integration
sign in (58) may be inverted. Hence (56) is true.

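Lemma 11. We have

(59) $\int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2}\, e^{itx}\,dx = \pi(T - |t|)$ if $|t| \le T$, and $= 0$ if $|t| > T$.

Proof: The Fourier transform of the function in the right-hand side of (59) is

$\pi \int_{-T}^{T} e^{itx}(T - |t|)\,dt = \frac{2\pi}{x^2}(1 - \cos Tx)$.

Hence (59) follows from (56).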
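
Lemma 11 concerns the Fourier pair formed by the weighting function $(1 - \cos Tx)/x^2$ and the triangular function $T - |t|$. Reading (59) as the statement that the integral of $(1 - \cos Tx)\,e^{itx}/x^2$ over the whole line equals $\pi(T - |t|)$ for $|t| \le T$ and 0 otherwise (an interpretation, since the display is damaged in this copy), a direct numerical check can be sketched as follows. The truncation point and step count are arbitrary choices and limit the accuracy to a few units in the third decimal place.

```python
import math

def triangle_transform(t, T, cutoff=1000.0, steps=200_000):
    """Simpson approximation of the integral of (1 - cos Tx)/x^2 * cos(tx) over (-cutoff, cutoff).

    The sine part of exp(itx) integrates to zero by symmetry, so only the
    cosine part is kept and the result is twice the integral over (0, cutoff).
    """
    def g(x):
        if x == 0.0:
            return 0.5 * T * T            # limit of (1 - cos Tx)/x^2 at x = 0
        return (1.0 - math.cos(T * x)) / (x * x) * math.cos(t * x)

    h = cutoff / steps                    # steps must be even for Simpson's rule
    total = g(0.0) + g(cutoff)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * g(j * h)
    return 2.0 * total * h / 3.0
```

With T = 1 the values at t = 0, t = 0.5 and t = 2 can be compared with $\pi$, $\pi/2$ and 0 respectively.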

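Lemma 12.

(60) $|\mathcal{E}(\xi_1 + \cdots + \xi_n)^k| \le A_k n^{k/2} \beta_k$.

Proof: As (60) is true for k = 1, let us assume, for induction, that it is true
for 1, 2, …, k. Then, by symmetry,

$\mathcal{E}(\xi_1 + \cdots + \xi_n)^{k+1} = n\,\mathcal{E}\{\xi_1(\xi_1 + \cdots + \xi_n)^k\} = n \sum_{r=0}^{k} \binom{k}{r}\, \mathcal{E}(\xi_1^{r+1})\, \mathcal{E}(\zeta^{k-r})$,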





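where $\zeta = \xi_2 + \cdots + \xi_n$. Since $\mathcal{E}(\xi_1) = 0$, we have

$\mathcal{E}(\xi_1 + \cdots + \xi_n)^{k+1} = n \sum_{r=1}^{k} \binom{k}{r}\, \mathcal{E}(\xi_1^{r+1})\, \mathcal{E}(\zeta^{k-r})$.

On the hypothesis of induction we have $|\mathcal{E}(\zeta^{k-r})| \le A_k (n - 1)^{(k-r)/2} \beta_{k-r} \le
A_k n^{(k-r)/2} \beta_{k-r}$. Hence

$|\mathcal{E}(\xi_1 + \cdots + \xi_n)^{k+1}| \le n \sum_{r=1}^{k} \binom{k}{r} \beta_{r+1} A_k n^{(k-r)/2} \beta_{k-r} \le A_{k+1} n^{(k+1)/2} \beta_{k+1}$.

Therefore the induction is complete.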
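
The induction in Lemma 12 shows that the kth moment of the sum grows no faster than $n^{k/2}$. A minimal sketch with exact rational arithmetic, assuming (purely for illustration) that each variable is +1 or -1 with probability 1/2, so that every $\beta_k = 1$; the function name is hypothetical.

```python
import math
from fractions import Fraction

def moment_of_rademacher_sum(n, k):
    """Exact kth moment of xi_1 + ... + xi_n for independent xi_i = +1 or -1,
    each value taken with probability 1/2."""
    total = Fraction(0)
    for j in range(n + 1):                 # j = number of +1's, sum = 2j - n
        total += Fraction(math.comb(n, j), 2 ** n) * (2 * j - n) ** k
    return total
```

For this distribution the fourth moment equals $3n^2 - 2n$, which is at most $3n^2$, in agreement with a bound of the form $A_4 n^2 \beta_4$; the odd moments vanish.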

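3. Elementary Proof of Theorem 1. 2.1. We have defined

(61) $F(x) = \Pr\{\sqrt{n}\,\bar{\xi} \le x\}, \qquad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy$,

with the characteristic functions

(62) $f(t) = \left\{p\left(\frac{t}{\sqrt{n}}\right)\right\}^n, \qquad \varphi(t) = e^{-t^2/2}$.

Following Berry¹⁰ we use the equation

(63) $\int_{-\infty}^{\infty} \{F(x) - \Phi(x)\} e^{itx}\,dx = \frac{f(t) - \varphi(t)}{-it}$.

Let $\psi(it)$ be the polynomial in (34), and let us define $\Psi(x)$ as the function obtained
from $\psi(it)$ through the replacement of each power $(it)^\nu$ by $(-1)^\nu \Phi^{(\nu)}(x)$.

Integration by parts shows $(-1)^{\nu-1} \int_{-\infty}^{\infty} e^{itx} \Phi^{(\nu)}(x)\,dx = (it)^{\nu-1} \varphi(t)$, whence

(64) $\int_{-\infty}^{\infty} \Psi(x) e^{itx}\,dx = \frac{\varphi(t)\psi(it)}{-it}$.

From (63) and (64) we obtain

(65) $\int_{-\infty}^{\infty} \{F(x) - \Phi(x) - \Psi(x)\} e^{itx}\,dx = \frac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it}$.

The function $\Psi(x)$ defined here is precisely the $\Psi(x)$ appearing in (5) under
Theorem 1. Our task is to prove that

(66) $|F(x) - \Phi(x) - \Psi(x)| \le \frac{Q_k}{n^{(k-2)/2}}$.

Following Berry¹¹ we replace x by x + a in (65), getting

(67) $\int_{-\infty}^{\infty} \{F(x + a) - \Phi(x + a) - \Psi(x + a)\} e^{itx}\,dx = e^{-ita}\, \frac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it}$.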

(B), p. 127, Equation (23), 
” (B), p. 127. 



DISTRIBUTIONS OF MEAN AND VARIANCE 




We multiply both sides of (67) by $T - |t|$ and integrate with respect to $t$ in $(-T, T)$:

$2 \int_{-\infty}^{\infty} \frac{1 - \cos Tx}{x^2}\, \{F(x + a) - \Phi(x + a) - \Psi(x + a)\}\, dx = \int_{-T}^{T} (T - |t|)\, e^{-ita}\, \frac{f(t) - \varphi(t)\{1 + \psi(it)\}}{-it}\, dt;$

the reversion of order of integration involved is obviously justifiable. Hence


( 68 ) 


c- 


— cos Tx 


(F(x + a) — ^ix + o) — 'I'(x + a)) dx\ 


< 1 


't 


1/(0 - v^(on +m)\ 


til. 


2.2. When in particular k = 3, (68) becomes


(69) P- 

' pe 


— coa Tx 


- [F(x + a) — $(a: + n)) dx 


<Tn-i 


1/(0 " (^(Ol 

i 


dL 


If we choose a to be the a in (50), the left-hand side of (69) is not less than




— cos a: 


dx — 


y^Lu.b.in*) 


*(j;) i ^ 


On the other hand, taking $T$ as in (36), the right-hand side of (69) is
not greater than

F» 


A f di =* A. 

Jo 

Hence 

(70) TS js fj' di - ,r| < yl. 

Now the left-hand side of (70), as a function of $T\delta$, is positive and increasing for
sufficiently large $T\delta$, and becomes infinite as $T\delta \to \infty$. Hence (70) implies that
$T\delta < A$, i.e.


I.u.b.|f(»)-i(.)|<^, 

giving Theorem 2. 

2.3. Coming back to the general case, we see that the function $\Phi(x) + \Psi(x)$
has a bounded derivative, $|\Phi'(x) + \Psi'(x)| \le Q_k$, and also has all the properties
of the function $F_1(x)$ in Lemma 8. On choosing a in (69) to be the a in (51)
we obtain





m - v'ft)!! *f i(m\\ 


dt, 


(71) ' 

where 

5 = Qk l.u.b. 1 Fix) - Mx) - nx) i . 
us take T = Vn)*"’ with -4* in accordant with t34S. 1’hcn 

y p/(0 -y(0(l + ^ (i0!l^^ 

Jo ( 

r*" 

« / + Qtn‘'‘ 'W , - Ji 4* /» 

i/n " 


(72) 


say. 


By (34) we have 

(73) Jk < Qk r (t*~‘ + • • • + t"' ■')«“''* - Qk. 

Jn 


Also, 

(74) 




■ dh 


The second term in the right-hand side of (74) is evidently $< Q_k$. The first
term does not exceed


(75) (3*n^'*“”Tl.u.b.jp(t)i\ 

• fco* 


At this step we make use of the non-singularity of $P(x)$ and apply Lemma 7
for m = 1. We have

l.u.b, |p(t)l = e"**, 

Hence (75) does not exceed < Qk . We have therefore 


(76) n ja jf"‘ -- —J --- dx - t| ^ Q*, T - Qfcn*'*"’*. 

Arguing with (76) as we did with (70) we conclude that 
luh.\F{x)-Hz)-n^)\<^^ 

(72) is valid for T > 1. If T < 1, we have only to suppress the term $J_2$. Hence
Theorem 1 is proved.


4. Proof of Theorem 3 and Theorem 4. 3.1. In connection with the random
variables (1), we assume that $\beta_{2k} < \infty$ for some integer $k \ge 3$ and define


7 =■ i t (t, - j)’, a{z ). pr ^ A 

^ I Vat - 1 i 


(77) 





Now, 


where 


(78) 

Vn Va< — 1 

Hence 


(79) 

G(z) = Pr{X - 

with 


(80) 

1 

^ \/n(on 


Let $W$ be the probability function of the distribution of the random point
$(X, Y)$ and $f(t_1, t_2)$ be the characteristic function:

(81)  $W(S) = \Pr\{(X, Y) \in S\}$ for every Borel set $S$ in $R_2$,

( 82 ) = ;;?»)}' 

(83) p{luU)^ r 

Let $G_1(z)$ be the distribution function of $X$. Then

(84)  $G(z) - G_1(z) = \iint_{z < x \le z + \lambda y^2} dW = K(z)$, say.

Let 




K.(z) = j I e-"'” dW. 


(85) 

*<*i*+Xv* 

If we define (for fixed z) the function G(x, y) by 

(86) Q{x, rj) = e if z < X < z Xy*, G(x, y) « 0 otherwise, 
then 


(87) 
Letting 

( 88 ) 


K.iz) = f rO(x,y) dW, 

C £ v)dxdy:^ gih, t,), 





we replace $x$ by $x - u$ in the integral and get


(89) 


f f — M, y) dzdy , /j). 


Multiplying both sides by $\frac{1 - \cos Tu}{u^2}$ and integrating with respect to $u$ we
obtain, with the help of (59), Lemma 11,

1 ~ COB Tu , 


(90) 


f f dy [ 0{x ~ M. y) 

•“10 ^“•to tA 


du 


ir{T — 1 tj 1 )ffC<i, tj) if I h 1 r, 
0 if}til>T; 


the reversion of order of integration in the left-hand side is obviously justifiable.
By Lemma 9 the right-hand side of (90) is summable in the whole plane of
$(t_1, t_2)$. Hence, by Lemma 10,


£ 


1 — cofl Tu . 

- 5 - G(j: - u, y) du 




(91) 


^ / / (r -lb|)ff(h,<i)e""^’‘’''dt,dti. 


|(,|:s:r 


If we integrate both sides with respect to the probability function $W$, we obtain,
on reversing the order of integration,


(92) 


j_ -- J f v) dl^ 




|<,|£r 


By (86) and (87), 
(93) 

Hence 


f f 0(x~ u, v) dW = K,(u + z). 


(94) 


jf ^ ■K^«(m z) du — j j (r — I fi|)(7(£i, k)f{ti ,li)diidk. 


Miisr 


We now take the functions 


vih , £j) = ® 


—l(l|+<5+2p(ltl) 


(95) 





and , ik) as in (35), -where 

Since the condition $\alpha_4 - 1 - \alpha_3^2 \ne 0$ is assumed in Theorem 3 and implied in
Theorem 4, we have $|\rho| < 1$. Let


■»(».!')-27 71-"7 ' 




and let yix, y) be the function obtained from \piiii , tk) through the replacement 
of each po-wer (iti)" (flj)’'* by y) = (-I)'*"*'* 

dx 'dy * 

Since 

(98) w(x, y) = i , U) di^dk , 


we have 

(99) w,,y,(x, y) = 

whence, by Fourier inversion, 


C C , t,) dudh, 


(100) (i:«i)'‘(iij)'V(«i ,U)= r r e"‘*-'*“''ip,„.(®, y) dxdy. 

«L-aC 

From the definition of 7 ( 1 , y) it follows therefore 

I” /] y) Ua: dy « <f>{k .0(1 + 


{w(z, y) + 7 (x, y) j dx dy a ip(fi ,0(1+ , *01 ■ 


A comparison of (101) with f f dH^ = /(f,, 0 shows that (94) will 

remain true if K,(u) be replaced by 

(102) J J e (tp(j, y) 4- y(x^ y)) dxdy >= L,{u), say, 

u <*^ U+Xv* 

and f{k , h) be replaced by ^.(J,, fj) [ 1 + i/'(ft,, iV,)). Hence 

I" {K.{u + z) - L.iu + z)\du 


- l'^l)<7(ii.O(/(li. 


fill, 0(1 + Hiti , lOJ} dhtlk . 





Let also 

(104) H{z) = j j {u)(x,y)+y(x,}/)]dxdi/, 

»-Xl/* £ I 

Bi{z) = j j |w(a:, y) 4- yix, y)} dxdy, 

(106) L{z) = H{z) - Hiiz) ^ j j (ui(a:, y) + y(x, y)j dxdy. 

3.2. We now consider the particular case k = 3 and prove Theorem 3. For
k = 3 we have $\psi \equiv \gamma \equiv 0$ and so


=“ f j w{x,y)dxdy, 




■^i(z) ~ j j “»(», v)dxdy 4>(s), 

«^JI 

m = H{z) - H.(«), 


(107) f t 

L,(z) = j j e~***w(x, y) dx dy, 

(<x£<'fX|r* 

1 — COB !r« , , V 

-(^«(« + *) - Lt(u +x)} du 

(108) ^ 

~ iir If ~ 1 1 f») (/(^i) U) — 9(/i, t*) jdii dij . 

1‘iisr 

Now 

K,{u) - L.(m) = {©(u) - $(u)) _ {^r(u) _ $(u)j _ 

— (iTCu) — iir,(u)} + (L(u) — L,(u)), 

0 S m - m - £ «■*'■ * /£'’ .-‘■'■■‘-•'m.-,,,. ^ 

^ , 
|c,., ^ ^ 

0 S K(„) - < „(y.) ^ 

0 < L(u) - L.(u) < At. 





Hence 


L - -+ A)) du 

(109) =0rL, + --^ __1 _ 

1 («i - 1) Vn VnVloi -])(]- pS; 

Kilrfr 


It is easy to verify that 

Tn”" *” W ““ tl- «u.n,„, „ 


TS 


^ I ~ 4 < at Le+J ( «« Y'\ 

'' 1 Vn \ai “■ 1 — a]/ J 


// . ia) - vj(li, I,) j 

l'i|s:r,|(,|sr 


By Lemma 9 (ii) we have

^ If \g(ii,h)\dlidt 2 

l‘i|sr,|< 5 |>r 


If \u(h, li)ldlidl3 , 

l I . t ^ tm , . . . 


IU|ar.|(,|>a’ 


( 111 ) 


Hence 


( 112 ) 


<at [[ .2, A , x'liil , x*|(jA , 

(<ilsr,|i,|>j> ' I ' * *’ / 


( r^* 1 

7'fi<3 / A ~ cos u 




dll 


<AlaiTe + 


/,_jttB_ 

U -1 - 


) a/i '/< , 

A. 4- X n ^ ■I' , X*'/‘^! 

I V« + ^ . + j 





By Lemma 9 (i) with k = 3 we have

<113) // ff , 


y] dll dll 


By (37) under Lemma 3 




I-'" *’1 forji,! <^(iji/)Vn 

with * /Sjf 

(US) ^ ^ -^-1 ^ (cT^ip / 


We now take 
(U6) 


A* 

~ («t4 -1)*’ " i. 


(U6) = ai\' ,- 

8\ -J Vn, 

the A coinciding with that in (114). Then 

(117) ~ sc;—■— 

= Vn . A( 


4(j^:Lg!K^ ^ ^0. -- 1 -,,») 

(118) («4 ~ l)/3, 


> 4h~J^,)Wn „ 


Hence (114) is true for U J < y 
obtain I M _ X 




' " " UU < r and j 4 I < r TWn. fi ■ . 

^ I. Using this fact on (I13) we 

^ If Iffllf ~ pldhcU, 

< 1 /”° y" / 

«* Vn J-„ X,^ ~ Ip ^ ^ I tj p| ^ 

<• \ 1 * 

* Vn + A) (izr^t 

_ AT>i ^ ^ P) 

■\/ne — 1 ) + fii(oti — - 1 

^ ^ __ («4 ~ 1 - «’)*« 

nVe Voi - 1 + p^(^an - 1)») — _1 

< («4 - 1 , - «5)‘^* 

n\4(^7^irY'ir^ip ■ 


( 119 ) 





Substituting in (112), setting c = (atT) ^ and using (116) wc obtain after some 
easy reduction 


( 120 ) 


ri 

Jo 


— cos u 


u‘ 


du ~ T 


< A ^1 + ^n(«< — 1) {»(«< — 1 — orj)) (n(c 3 t< - i — af)} _ * 

If /i > (oi — 1 — aly^oit, then the right-hand side of (120) is ;< A, and so, 
arguing with (120), as we did with (70), we obtain 


(121) i.u.b. I e(„) - t(u) I < f - (;7rfr:s)' 

For n < (at ~ 1 — ixl)~''oit, however, the right-hand side of (121) > ^(04 ™ 
1 — a 5 )''^Qi« > A and ( 121 ) becomes a triviality. Hence Theorem 3 is proved. 
3.3. To prove Theorem 4, we start again with the identity (103). We have 

K,(u) - L,(u) = ((?(u) - mu)} - {(?j(u) - //i(u)) 

( 122 ) 

- {i!:(u) - /c.(u)i -f {i(u) - /.,(u)i. 

(123) 0 < K{u) - K,iu) < ee(K“) < (2*6 by Lemma 12, 

(124) 0 ^ L{u) — L,(u) < < / / J/) -f l7('5t J/) |) d* dj/ < 


Let us show that 


(126) I G,{v) ~ m(u) I < 

The function X = ^ ) has the same structure as •\/ji ^ (with 

{oii — 1) l(^, ~ 1) playing the role of t;)j hence, by Theorem 1, there exists 
an asymptotic expansion of the distribution function (?*(«). We shall see that 
the terms of this asymptotic expansion are precisely //i(n), whence (125) follows 
from Theorem 1. 

It is obvious tha,t for the polynomial V'(i(i, ih) in (36) <p{U, 0) coincides with 

^ (34). Hence the terras of the asymptotic expansion 
of Gi{u) are the inversion of e “ j 1 -{- 0)} viz. 

(126) 4>( m ) e 0) dL 

On the other hand, by (104), 


(127) Ili(u) ~ ^(u) -h f dz f y(x, y) dy, 

*^00 *^«o 

and by (101) with (2 = 0, 


(128) 






yix, y) dy = 


e'-*‘V(f(, 0). 





Inversion of (118) gives 


<**•« 


which establishes the equality of $H_1(u)$ and (126).
Using (122), (123), (124), (125) on (103) we get


l_ ^+ 2) - //(w 4- 2)) du » AiT (t + ^,,1. 


+ ©T j" / 1 giti , <j) l'l/(ti, U) — v>(b , fi)(l + f(»<i 1 tf»)l I dhdk . 

If we expand 

(131) H{u) = JJ {w{x, y) + y(x, y)} dx dj/ 

in powers of n~* up to and including the term the remainder is obviously 

Atn"''*"®*, Hence 

(132) E(u} = <l>(ix) + x(w) + 

where $(ii) + x(^) ia the group of terms of the Taylor expansion of (131) in 
powers of n“* up to and including the term n"*'*"”. From (130) and (132) wo get 


f I — [a(u + z) ~ ^(u + z) - x(ti + z) 1 






+ - 41 , 


where 


(134) I — T f f I ff(ti , ti) I '[/(fi, k) — ¥>(fi, ft) {1 + V'(ifi) ift) 11 df 1 d/s. 
|<,|sr 

We are going to prove that the function $\chi(u)$ here defined satisfies all the
requirements of the function $\chi(u)$ in Theorem 4. The structure of $\chi(u)$ announced
in Theorem 4 is easily verifiable. It remains to prove the inequalities
(15) and (16) satisfied by

I 0{u) - ^u) - x(w) I . 

It is obvious that the function $\Phi(u) + \chi(u)$ has all the properties of the
function $F_1(u)$ in Lemma 8, having a bounded derivative $|\Phi'(u) + \chi'(u)| \le Q_k$.
Hence, on taking $z$ in (133) to be the number a in (51), the left-hand side of (133)
does not exceed

Q,n (z j[ dM - sr) , 5 = Q*l,u.b. 1 Giu) - Hu) - x(u) |. 





Hence 

( 135 ) n (3 i e. !■ (, + + q./. 

In order to appraise I we recall (36) under lAimraa 3 (replacing therein each 
/3 a, by the larger number /9*iy3w, and merging the latter into Qk) 

1/(^1, h) — ¥>(ii, ti){l + , itj)| I < -Mji'i {2^(l^r + • • • 

(136) ’ 

for 


(137) I i < Qk-y/n. 

Put T = (Qi\/n)\ with Q* here coinciding with that in (137) and then (136) 
is valid for | | < and | | < . Write 

// +5r 11 +T 11 + + 

|(i| sr,ii,|> rin t'' 1< H i)r 

|<,|s:rin 

By Lemma 9 (i), 


^^38) Ii < ^//l/ - v^d -f 

whence, by (136) 


Ti < 


(139) 


QkT 

JL„ \izi 


+ + ii,r*-”)) 

(<(+<!)/» 


dlidli < 


Q,r 


By Lemma 9 (iii) we have 


h<QkT 


II ..,.w{ 


+ 


{|/((i, ta) 1 + <p{ii , <i) 11 + Hit, , Ih) l! di, dh 


Obviously, 

(140) 




On the assumption of non-singularity of $P(x)$ we have, by Lemma 7,

l.u.b |/((A,t,)|^ l.u.b. \v(A-, -t-V 


(141) 


l.u.b. 


^(vn' 


t II ^ 







Hence 


Q* p c/i, 

(<i|i:r,|(j|>J'i" 

„ /n' • n**'**” 


For li we have | h | > = Qk\/n, and &o I^emma 7 is applicat>le to fa in the 

same manner as to h . Usinp; T^mma D (i) on the factor ] gih , 1%) ! wp got 


Combining (135), (138), (139), (142), (143) we obtain 


^ 4-^’’ 4- 

' fi/ik 


Putting € = "DF-'nnaih wo got, as the last term in (144) is < , 

Tg 

n(z - ^) < Q, + Q,n"'(- * - + .‘-) 

\ Jo J — Hk-T Hk'k “ Jjin-S )J ' 

If 4 < fc < C, Ave take J = fc — 2 and get 

(' f -') s e. + e. + 1 ) < «.. 


Hence, by the argument following (70), 

givmg (15). n h > 7, we t»ko 1 - ~^j go, 

TS (3 r ^-Zl^du - + _ J „ 

\ Ja U* y 1 -icfc I A -r ^(*^1,/,((*+,„ J 

TT ^ ' 


Hence 


l.u.b.l(?(u) - $(u) — x(«)i ^ 5*™ „ 


giving (16). Therefore Theorem 4 is proved. 

$\alpha_4 - 1 - \alpha_3^2 = 0$. If $\alpha_4 - 1 - \alpha_3^2 = 0$, then there is unit probability
that $\xi_i$ assumes exactly two values:

$\Pr\{\xi_i = a\} = p, \qquad \Pr\{\xi_i = b\} = q, \qquad p + q = 1.$





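Let $\zeta_i = 1$ with probability $p$ and $= 0$ with probability $q$. Then $\xi_i = b + (a - b)\zeta_i$ and
$\sum_{i=1}^{n} (\xi_i - \bar{\xi})^2 = (a - b)^2 \sum_{i=1}^{n} (\zeta_i - \bar{\zeta})^2$. Hence it is sufficient to consider the
variable $\frac{1}{n} \sum_{i=1}^{n} (\zeta_i - \bar{\zeta})^2 = \eta$. Letting $\sum \zeta_i = r = np + \sqrt{npq}\, X$ we have
$n\eta = r - \frac{r^2}{n} = npq + (q - p)\sqrt{npq}\, X - pqX^2$. We now consider two distinct cases.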
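The reduction of the sample variance of a two-valued variable to a quadratic in the standardized count can be written out step by step. The block below is a sketch under assumed notation: $\zeta_i$ takes the values 0 and 1, $r = \sum \zeta_i$, $\bar{\zeta} = r/n$, $q = 1 - p$, and $X$ is defined by $r = np + \sqrt{npq}\,X$.

```latex
% Assumed notation: \zeta_i \in \{0, 1\}, r = \sum_i \zeta_i, \bar\zeta = r/n, q = 1 - p.
\begin{aligned}
\sum_{i=1}^{n} (\zeta_i - \bar\zeta)^2
  &= \sum_{i=1}^{n} \zeta_i^2 - n\bar\zeta^{\,2}
   = r - \frac{r^2}{n}
   \qquad (\text{since } \zeta_i^2 = \zeta_i), \\
r - \frac{r^2}{n}
  &= np + \sqrt{npq}\,X
     - \frac{n^2 p^2 + 2np\sqrt{npq}\,X + npq\,X^2}{n} \\
  &= np(1 - p) + (1 - 2p)\sqrt{npq}\,X - pq\,X^2 \\
  &= npq + (q - p)\sqrt{npq}\,X - pq\,X^2 .
\end{aligned}
```

When $p = q = 1/2$ the linear term drops out, which is why that case has to be treated separately.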

Case (i). $p \ne q$. Here

= Pr\(X + eVi)' > c‘n - 2\c\Vh\. 

' ' 2vp/ 

Thus F(z) = 1 if z > ^ I c ( \/?i< If 2 < 11 c I Vn, then 

F{z) = Pr{X < -cn- (c'n - 2 | c j Vj“i2)‘1 

4- PrlJ > ~c\/n + (c^n ~ 2\c \ L /'Vat'. 

To the random variable $X$ Theorem 2 can be applied. Suppose that $c < 0$;
then, by Tchebycheff's inequality,

By Theorem 2, 

Fiiz) = Pr{X < -cn - (c^'n. - 2\c\\/nzY] 


= $( 2 ) 4 


Hence 


Vnw ' 


(146) I FI,) - 4(2) I < .4 ‘ I 

[Vnpq ^/n\p-q\ n(p - qyj ^ 

The same inequality holds also for c > 0.

Case (ii). p = q = 1/2. Here $n\eta = \frac{1}{4}(n - X^2)$; hence


(146) Prj,. > -^1 = Prir < z] ^ 4 -J. 

There is no asymptotic expansion for the distribution function of $X^2$. (See
(C), p. 83.)



SAMPLING INSPECTION PLANS FOR CONTINUOUS PRODUCTION 
WHICH INSURE A PRESCRIBED LIMIT ON THE OUTGOING QUALITY 

A. Wald and J. Wolfowitz
Columbia University

1. Introduction. This paper discusses several plans for sampling inspection of
manufactured articles which are produced by a continuous production process,
the plans being designed to insure that the long-run proportion of defectives
shall not exceed a prescribed limit. The plans are applicable to articles which
can be classified as "defective" or "non-defective" and which are submitted for
inspection either continuously or in lots. In Section 2 the notions of "average
outgoing quality limit" and "local stability" are discussed. The valuable concept
of average outgoing quality limit for lot inspection is due to Dodge and
Romig [4], and that for inspection of continuous production to Dodge [1]. Section
3 contains a description of a simple inspection plan (SPA) applicable to
continuous production and a proof that the plan will insure a prescribed
average outgoing quality limit. Section 4 contains a proof that this inspection
plan also has the important property that it requires minimum inspection when
the production process is in statistical control. In Section 5 is contained the
description of a general class of plans which possess both these important properties.

The problem of adapting SPA to the case when the articles are submitted for
inspection in lots instead of continuously is treated in Section 6. Some methods
of achieving local stability are discussed in Section 7 and a specific plan is developed
there. Finally Section 8 discusses the relationship between the present
work and that of the earlier and very interesting paper of H. F. Dodge [1],
mentioned above.

If a quick first reading is desired the reader may omit the second half of Section
3 (which contains a proof of the fact that SPA guarantees the prescribed average
outgoing quality limit) and the entire Section 4 except for its title (the proof of
the statement made in the title of Section 4 occupies the whole section).

2. Fundamental notions. In this paper we shall deal only with a product
whose units can be classified as "defective" or "non-defective." We shall
assume that the units of the product are submitted for inspection continuously,
except in Section 6, where we assume that they are submitted in lots. Throughout
the paper we shall assume that the inspection process is non-destructive,
that it invariably classifies correctly the units examined, and that defective units,
when found, are replaced by non-defectives. By the "quality" of a sequence of
units is meant the proportion of defectives in the sequence as produced. By the
"outgoing quality" (OQ) of a sequence is meant the proportion of defectives
after whatever inspection scheme is in use has been applied. If this
scheme involves random sampling, then in general the OQ is a chance variable.



(It depends on the variations of random sampling.) If the OQ converges to
a constant $p_0$ with probability one as the number of units produced increases
indefinitely, $p_0$ is called the "average outgoing quality" (AOQ). The AOQ
when it exists is therefore the average quality, in the long run, of the production
process after inspection. It is a function of both the production process and the
inspection scheme. These definitions are due to Dodge [1].

The "average outgoing quality limit" (AOQL) is a number which is to depend
only on the inspection scheme and not at all on the production process. Roughly
speaking, it is a number, characteristic of an inspection scheme, such that no
matter what the variations or eccentricities of the production process, the AOQ
never exceeds it. For the purposes of this paper we shall need the following
precise definition: Let $c_i$ be zero or one according as the $i$th unit of the product,
before application of the inspection scheme, is a non-defective or a defective,
respectively. Let $d_i$ have a similar definition after application of the inspection
scheme. (We note that if the $i$th item was inspected, then $d_i = 0$; if the $i$th
item was not inspected, then $c_i = d_i$.) The sequence $c = c_1, c_2, \dots, c_N, \dots$,
ad inf. characterizes the production process.¹ The elements of $d = d_1, d_2, \dots$,
ad inf. are in general chance variables. The number $L$ is called the AOQL if it
is the smallest² number with the property that the probability is zero that

$\limsup_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} d_i > L,$

no matter what the sequence c.

It should be noted that this definition of AOQL places no restrictions whatever
on the production process, since all sequences c are admitted. It is too much
to expect a production process to remain always in control; indeed, doubt as to
whether statistical control always exists may cause a manufacturer to institute
an inspection scheme. The inspection schemes which we shall give below will
yield a specified AOQL no matter what the variations in production are. If
these schemes are employed, then, even if Maxwell's demon of gas theory fame
were to transfer his activities to the production process, he would be unsuccessful
in an effort to cause the AOQL to be exceeded. A dishonest manufacturer might
sometimes essay to do this. If we imposed restrictions on the sequence c and

¹ The use of an infinite sequence to describe the production process deserves a few words.
What we consider in this paper are schemes applicable when the number of units produced
is large and operate mathematically as if the production sequence were of infinite length.
Naturally the latter is never the case in actuality. However, the larger the number of

mathematical model. While the present definition uses explicitly the notion of an infinite
sequence, such a commonplace statement as "the probability is 1/2 that a coin will fall
heads up" uses this notion implicitly. It is also implicit in the intuitive meaning we ascribe
to such a word as "average," which is in everyday use.

² It is not difficult to see that such a number always exists, for it is the lower bound of a

bound) a^dXser^ 





determined the AOQL on that basis, we would run the danger that the relative
frequency of defects in the sequence of outgoing units might exceed the AOQL if
it happened that the actual sequence c did not satisfy the restrictions imposed.

After we discuss below various possible sampling inspection plans which
insure that the AOQL does not exceed a predetermined value L, it will be seen
that for any given L > 0 there are many sampling inspection schemes which do
this. To choose a particular sampling plan from among them the following
considerations may be advanced: If two inspection plans S and S' both insure
the inequality AOQL ≤ L and if for any sequence c the average number of
inspections required by S is not greater than that required by S' and if for some
sequences c the average number of inspections required by S is actually smaller
than that required by S', then S may be considered, in general, a better inspection
plan than S'. However, the amount of inspection required by a sampling
plan is not always the only criterion for the selection of a proper sampling
scheme. There may be also other features of a sampling plan which make it
more or less desirable. We shall mention here one such feature, called "local
stability," which will play a role in our discussions later. Consider the sequence
d obtained from the sequence c by applying a sampling inspection scheme. Even
if the AOQL does not exceed L, it may still happen that there will be many large
segments of the sequence d within which the relative frequency of ones is considerably
higher than L. For instance, it may happen that in the segment
$(d_1, \dots, d_m)$ the relative frequency of ones is equal to $\frac{1}{2}L$, in the segment
$(d_{m+1}, \dots, d_{2m})$ the relative frequency is equal to $\frac{3}{2}L$, in the segment
$(d_{2m+1}, \dots, d_{3m})$ the relative frequency is again equal to $\frac{1}{2}L$, and this is followed again
by a segment of m elements where the relative frequency of ones is equal to $\frac{3}{2}L$,
and so forth. If m is large, such a sequence d is not very desirable, since each
second segment will contain too many defects. A sequence d is said to be not
locally stable if there exists a large fixed integer m such that the relative frequency
of ones in $(d_{k+1}, \dots, d_{k+m})$ is considerably greater than L for many integral
values k. On the other hand, the sequence d is said to be locally stable if for
any large m the relative frequency of ones in $(d_{k+1}, \dots, d_{k+m})$ is not substantially
above L for nearly all integral values k. This is clearly not a precise
definition of "local stability," but merely an intuitive indication of what we want
to understand by the term, since we did not define what we mean by "large m,"
"many values of k," "considerably above L," etc. A precise definition of local
stability will not be needed in this paper, since it is not our intention to develop
a complete theory for the choice of the sampling plan. The idea of local stability
will be used in this paper merely for making it plausible that some schemes we
shall consider behave reasonably in this respect. A similar idea, called "protection
against spotty quality," is discussed by Dodge [1]. A possible precise definition
of local stability could be given in terms of the frequency with which
$V(N) = \frac{1}{k} \sum_{i=N+1}^{N+k} d_i$ ($k$ being fixed) lies within given limits.
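The intuitive notion of local stability can be made concrete by computing the relative frequency of ones in every window of fixed length and counting how often it exceeds a chosen limit. The sketch below is only one way of doing this: the function names, the window length m, and the limit passed in are hypothetical choices of this example, since the text deliberately leaves "large m" and "considerably above L" undefined.

```python
def window_frequencies(d, m):
    """Relative frequency of ones in each window (d[k], ..., d[k+m-1]).

    d is a list of 0/1 values (the outgoing sequence) and m is the window
    length. A running count is updated so each window costs O(1).
    """
    if m <= 0 or m > len(d):
        raise ValueError("window length must satisfy 0 < m <= len(d)")
    count = sum(d[:m])
    freqs = [count / m]
    for k in range(m, len(d)):
        count += d[k] - d[k - m]   # slide the window one unit to the right
        freqs.append(count / m)
    return freqs


def fraction_of_windows_above(d, m, limit):
    """Share of windows whose relative frequency of ones exceeds `limit`."""
    freqs = window_frequencies(d, m)
    return sum(1 for v in freqs if v > limit) / len(freqs)


if __name__ == "__main__":
    # Hypothetical outgoing sequence: a clean stretch followed by a spotty one.
    d = [0] * 50 + [1, 0, 0, 0, 0] * 10
    print(fraction_of_windows_above(d, m=10, limit=0.1))
```

A sequence would be called locally stable, in the informal sense of the text, when this share stays small for the window lengths of interest.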





3. A sampling inspection plan which insures a given AOQL no matter what
the variations in the production process. The only feature of the sampling
(inspection) plan (SP) studied in this section and hereafter referred to as SPA
which we shall consider here is that it insures the achievement of a specified
AOQL. Considerations leading to a choice among several schemes are postponed
to later sections.

For convenience, let $f$ be the reciprocal of a positive integer. SPA calls for
alternating partial inspection and complete inspection. Partial inspection
is performed by inspecting one element chosen at random from each of successive
groups of $1/f$ elements. Complete inspection means the inspection of every
element in the order of production. SPA is completely defined when a rule
is given for ending one kind of inspection and beginning the other.

It is clear that all SP need not be of the above class. Thus, for example, a
scheme might consist of partial inspection with various $f$'s employed in various
sequences. We make no attempt in this paper to examine all possible schemes.
For simplicity in practical operation, alternation of complete inspection and
partial inspection with fixed $f$ would seem reasonable. The Dodge scheme [1]
is of this type.

We shall also not discuss the question of a choice of the constant $f$, but will
assume that a particular value has been chosen for various reasons and is a datum
of our problem. Reasons which might influence a manufacturer in his choice
of $f$ could be contract specifications which impose a minimum on the amount of
inspection, or psychological grounds to the same effect. The manufacturer
may desire a certain minimum amount of inspection in order to detect malfunctioning
of his production process. Also $f$ controls local stability to some
extent. The consequences of a choice of $f$ as they appear in the theory below
may also play a role.

Returning to SPA, we begin with partial inspection. Let $L$ be the specified
AOQL. Denote by $k_N$ the number of groups of $1/f$ units in which defectives
were found as the result of partial inspection from the beginning of production
through the $N$th unit. SPA is as follows:

(a) Begin with partial inspection.

(b) Begin full inspection whenever

$e_N = \frac{k_N}{N}\left(\frac{1}{f} - 1\right) > L.$

(c) Resume partial inspection when

$\frac{k_N}{N}\left(\frac{1}{f} - 1\right) \le L.$

(d) Repeat the procedure. (It will be recalled that defective units, when
found, are always to be replaced with non-defectives.)
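Steps (a) to (d) can be tried out on a given production sequence with a short simulation. What follows is a sketch under stated assumptions rather than the authors' procedure verbatim: it takes the switching quantity to be $e_N = k_N (1/f - 1)/N$, compared with $L$ as in (b) and (c); `group_size` stands for $1/f$; and the function name, the seed argument, and the handling of an incomplete final group are choices made only for this example.

```python
import random


def simulate_spa(c, group_size, L, seed=0):
    """Apply the alternating partial/complete inspection rule to a 0/1 list c.

    Assumed rule (see the lead-in): e_N = k_N * (group_size - 1) / N, with
    complete inspection while e_N > L and partial inspection otherwise.
    Returns (d, inspected): the outgoing 0/1 sequence and the number of
    units inspected. Defectives that are found are replaced (set to 0).
    """
    rng = random.Random(seed)
    n_total = len(c)
    d = list(c)
    inspected = 0
    k = 0      # partially inspected groups in which a defective was found
    pos = 0    # N: number of units produced and processed so far
    while pos < n_total:
        # Partial inspection: one unit chosen at random from the next group.
        end = min(pos + group_size, n_total)
        pick = rng.randrange(pos, end)
        inspected += 1
        if c[pick] == 1:
            d[pick] = 0
            k += 1
        pos = end
        # Complete inspection, unit by unit, for as long as e_N exceeds L.
        while pos < n_total and k * (group_size - 1) / pos > L:
            d[pos] = 0
            inspected += 1
            pos += 1
    return d, inspected


if __name__ == "__main__":
    # Worst case for the plan: every unit produced is defective.
    d, inspected = simulate_spa([1] * 20000, group_size=10, L=0.05)
    print(sum(d) / len(d), inspected / len(d))
```

With an all-defective input the outgoing fraction of ones should settle near $L$, which is the behaviour the text goes on to prove; with a defect-free input only the partial inspections are performed.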





It is to be observed that in this plan the number of partial inspections increases
without limit. For, while complete inspection is going on, the value of $k_N$
remains constant, so that after a long enough period of complete inspection the
denominator $N$ of the expression which defines $e_N$ will have increased sufficiently
for $e_N$ to be not greater than $L$. On the other hand, complete inspection may
never occur. This will be the case if, for example, no defectives or very few
defectives are produced.

We shall now show that the AOQL of the above SP is $L$. We first note that,
at $N$, $e_N$ can increase only by $\frac{1}{N}\left(\frac{1}{f} - 1\right)$. Hence, for sufficiently large $N$,
$e_N < L + \epsilon$, where $\epsilon > 0$ may be arbitrarily small.

Suppose now that the production process is subject to any variations whatsoever,
i.e., the sequence

$c = c_1, c_2, \dots, c_N, \dots$, ad inf.

is any arbitrary sequence whatever (by their definition the $c_i$ are all zero or one).
Our result is therefore proved if we show that, with probability one,

(3.1)  $\lim_{N \to \infty} \left( \frac{1}{N} \sum_{i=1}^{N} d_i - e_N \right) = 0$

for this arbitrary c, and that for at least one c

(3.2)  $\lim_{N \to \infty} e_N = L.$


Let $S(N)$ be the number of groups of $1/f$ units which have been partially inspected
through the $N$th unit. Define $x_i$ as zero if in the $i$th partially inspected
group a non-defective was found and as one if a defective was found. We have

$k_N = \sum_{i=1}^{S(N)} x_i.$

Since the number of times partial inspection takes place increases indefinitely,
$S(N) \to \infty$ as $N \to \infty$. Also $S(N) \le fN < N$. Let $a_j$ be the serial number
of the last unit in the $j$th partially inspected group. Then for all $j$ the expected
value $E(x_j)$ of $x_j$ is given by

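$E(x_j) = f \sum_{i = a_j - (1/f) + 1}^{a_j} c_i.$

We have, for all $j$,

(3.3)  $\sum_{i = a_j - (1/f) + 1}^{a_j} (c_i - d_i) = x_j,$

so that

$E\left[\left(\frac{1}{f} - 1\right) x_j - \sum_{i = a_j - (1/f) + 1}^{a_j} d_i\right] = 0.$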




Also from (3.3) it follows, since $x_j$ is the value of a binomial chance variable
from a population of fixed number $1/f$, that there exists a positive constant $\beta$
such that

(3.4)  $\sigma^2\left(\left(\frac{1}{f} - 1\right) x_j - \sum_{i = a_j - (1/f) + 1}^{a_j} d_i\right) < \beta$

where $\sigma^2(x)$ is the variance of a chance variable $x$. Now a theorem of Kolmogoroff
(Kolmogoroff [2], Fréchet [3], p. 254) states:

A sequence of chance variables with zero means and variances $\sigma_1^2, \sigma_2^2, \dots$
converges with probability one towards zero in the sense of Cesàro if

(3.5)  $\sum_{j=1}^{\infty} \frac{\sigma_j^2}{j^2}$

converges. The inequality (3.4) permits us to apply this theorem to the sequence
of chance variables of which the $j$th ($j = 1, 2, \dots$ ad inf.) is

$\left(\frac{1}{f} - 1\right) x_j - \sum_{i = a_j - (1/f) + 1}^{a_j} d_i,$

since the series $\sum_{j=1}^{\infty} \frac{1}{j^2}$ is well known to be convergent. We therefore obtain that,
with probability one,

$\lim_{N \to \infty} \frac{1}{S(N)} \left[\left(\frac{1}{f} - 1\right) k_N - \sum_{i=1}^{N} d_i\right] = 0,$

since the units which are fully inspected contribute nothing to $\sum d_i$. Since
$S(N) < N$, the desired result (3.1) is a fortiori true.
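The step from the uniform variance bound to the convergence required by Kolmogoroff's theorem can be written out explicitly. In the sketch below, $\beta$ denotes the bound of (3.4) and $\sigma_j^2$ the variance of the $j$th variable in the sequence; the closed value of the series is the classical one and is quoted only to make the finiteness visible.

```latex
% \beta: variance bound from (3.4); \sigma_j^2: variance of the j-th chance variable.
\sum_{j=1}^{\infty} \frac{\sigma_j^2}{j^2}
  \;\le\; \beta \sum_{j=1}^{\infty} \frac{1}{j^2}
  \;=\; \frac{\beta\,\pi^2}{6}
  \;<\; \infty .
```

Since every term is non-negative, the comparison shows that the series (3.5) converges, which is all the theorem requires.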

If c is such that all the $c_i$ are one, it is readily seen that (3.2) holds. If many
(this adjective can be precisely defined) defectives are produced, this will also
be the case. This completes the proof of the fact that the AOQL of SPA is $L$
no matter how capriciously the production process may vary.


4. When the production process is in statistical control, SPA requires minimum
inspection. The production process is said to be in statistical control if there
is a positive constant $p < 1$ such that, for every $i$, the probability that $c_i = 1$
is $p$ and is independent of the values taken by the other c's. We shall see that
if the process is in statistical control and if SPA is applied to it, the specified
AOQL is guaranteed with a minimum amount of inspection.

The number of units inspected through the $N$th unit produced is

(4.1)  $I(N) = N - \left(\frac{1}{f} - 1\right) S(N).$

If the process is in statistical control we have, with probability one, 





(4.2)  $\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} c_i = p$

by the strong law of large numbers. Shortly we shall prove the existence of a
constant $L^*$ such that, with probability one,

(4.3)  $\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} d_i = L^*.$

Assume for the moment that this is so. Since it is only by inspection that defectives
are removed, and the units selected for inspection are in statistical control
like the original sequence, it follows that, with probability one,


(4.4)  $\lim_{N \to \infty} \frac{I(N)}{N} = \frac{1}{p}\,(p - L^*) = 1 - \frac{L^*}{p}$

because, with probability one,

$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} (c_i - d_i) = p - L^*.$


Inspection is therefore at a minimum when $L^*$ is at a maximum compatible
with the specified AOQL. By (4.3) the latter means that

(4.5)  $L^* \le L.$

SPA has been shown to guarantee this requirement. The optimum situation
from the point of view of the amount of inspection would therefore be to have
$L^* = L$, but this cannot always be achieved. The absolute minimum amount
of inspection clearly is $f$, i.e., partial inspection exclusively. Consequently
from (4.4)

$1 - \frac{L^*}{p} \ge f,$

so that

(4.6)  $L^* \le p(1 - f).$

Combining (4.5) and (4.6) we see that we have to consider three cases:

Case a. If

(4.7)  $p > \frac{L}{1 - f},$

we have to show that

(4.8)  $L = L^*.$

Case b. If

(4.9)  $p < \frac{L}{1 - f},$

we have to show, by (4.4), that

$1 - \frac{L^*}{p} = f,$

that is,

(4.10)  $L^* = p(1 - f).$

Case c. If

(4.11)  $p = \frac{L}{1 - f},$

we have to show that

$L = L^* = p(1 - f).$
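The three cases amount to a single rule: under statistical control the limiting outgoing quality is the smaller of $L$ and $p(1 - f)$, and by (4.4) the limiting fraction inspected is one minus that value divided by $p$. The helper below merely tabulates this claimed limit; it is a sketch, the function name and argument checks are assumptions of this example, and it restates what the section sets out to prove rather than proving it.

```python
def limiting_quality(p, L, f):
    """Return (L_star, inspected_fraction) claimed for SPA under control.

    p: probability that a produced unit is defective (0 < p < 1)
    L: specified AOQL (L > 0)
    f: sampling fraction used in partial inspection (0 < f < 1)

    L_star = min(L, p * (1 - f)) covers cases a, b and c together, and the
    inspected fraction follows from relation (4.4): 1 - L_star / p.
    """
    if not (0.0 < p < 1.0 and 0.0 < f < 1.0 and L > 0.0):
        raise ValueError("expected 0 < p < 1, 0 < f < 1 and L > 0")
    l_star = min(L, p * (1.0 - f))
    return l_star, 1.0 - l_star / p


if __name__ == "__main__":
    # Hypothetical values: poor quality (case a) and good quality (case b).
    print(limiting_quality(0.10, 0.05, 0.10))
    print(limiting_quality(0.02, 0.05, 0.10))
```

In case b the inspected fraction reduces to $f$, that is, partial inspection only, which is the absolute minimum mentioned in the text.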


Proof of (4.8): We have already remarked in Section 3 that in SPA partial
inspection always recurs, but complete inspection need never occur. We shall
show in a moment that (4.7) implies that no matter how large an integer $\gamma$
is chosen, the probability of temporarily stopping partial inspection for some
$N > \gamma$ is one. Assume that this is so. Choose an arbitrarily small positive
$\epsilon$, and let $\gamma > \frac{1}{\epsilon}\left(\frac{1}{f} - 1\right)$. For a sequence where complete and partial inspection
alternate infinitely many times let

$A = \alpha_1, \alpha_2, \dots$, ad inf.

be the sequence of integers at which partial inspection ends, and let

$B = \beta_1, \beta_2, \dots$, ad inf.

be the sequence of integers at which complete inspection ends. Then, for all $j$,

$\alpha_{j+1} > \beta_j > \alpha_j.$

From the description of SPA it follows that, for all $N > \gamma$ which belong to either
A or B,

(4.13)  $|e_N - L| < \epsilon.$

In Section 3 we proved

(3.1)  $\lim_{N \to \infty} \left( \frac{1}{N} \sum_{i=1}^{N} d_i - e_N \right) = 0$

with probability one. Since $\epsilon$ is arbitrarily small it follows that, with probability
one,

(4.14)  $\lim_{N \to \infty,\; N \text{ in } A \text{ or } B} \frac{1}{N} \sum_{i=1}^{N} d_i = L.$



38 


A, -WAM AND J. WODFOWrrZ 


To complete tlie proof of (‘t.8) we have .still to ahow that L* existK and that the 
probability is one that complete inapeclion will occur intiiutely many times. 
First we prove that L* exists. 

As N increases during an interval of complete inspection, = X! dj 

1-1 


remains constant. 


Hence 


‘~N 


decrcaaes monntonioally. 


Kince for the ends 


of such intervals (4.14) holds, it follows that (4.14) holds iw X x and is a 
member of A, 5, or an interval (a,, fi,) for all j. 

Let $N \to \infty$ while always being in the interior of an interval $(\beta_j, \alpha_{j+1})$, $j = 1, 2, \ldots$, ad inf., which contains $\alpha_{j+1}$ but not $\beta_j$. Let $N^*$ be the total number of units in these intervals through the $N$th unit produced. Let $N_1$ and $N_2$ be such that

$$\beta_j = N_1 < N_2 \le \alpha_{j+1}.$$

Then

$$N_2^* - N_1^* = N_2 - N_1.$$

Since the production process is in statistical control, we have, by the strong law of large numbers,

(4.15)  $\lim_{N \to \infty} \frac{D(N)}{N^*} = p(1 - f) = p'$

with probability one. Let $\delta^*$ be the general designation for numbers less than $\eta$ in absolute value, so that all $\delta^*$ are not the same. With probability one, for almost all $N$, we have by (4.15)

$$\frac{D(N_1)}{N_1^*} = p' + \delta^*, \qquad \frac{D(N_2)}{N_2^*} = p' + \delta^*.$$

Write

$$\frac{D(N_2) - D(N_1)}{N_2 - N_1} = K.$$

Now

$$\frac{D(N_2)}{N_2^*} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1^* + (N_2^* - N_1^*)} = \frac{(p' + \delta^*) N_1^* + K (N_2 - N_1)}{N_1^* + (N_2 - N_1)} = p' + \delta^*.$$

Hence

(4.16)  $K(N_2 - N_1) = 2\delta^* N_1^* + (p' + \delta^*)(N_2 - N_1).$



Now suppose (4.3) does not hold. From the definition of AOQL it follows that for some $\eta > 0$ there exist sequences (whose totality has a positive probability) so that, for infinitely many $N_2$, we have

(4.17)  $\frac{D(N_2)}{N_2} = \frac{D(N_1) + [D(N_2) - D(N_1)]}{N_1 + (N_2 - N_1)} < L - 4\eta.$

For large enough $N_1$, from (4.14),

$$\frac{D(N_1)}{N_1} = L + \delta^*$$

with probability one and hence, using (4.16) in (4.17),

(4.18)  $N_1(L + \delta^*) + 2\delta^* N_1^* + (p' + \delta^*)(N_2 - N_1) < L N_1 + L(N_2 - N_1) - 4\eta N_2,$

from which, using the fact that $p' > L$ (from (4.7)), we get

(4.19)  $N_1 \delta^* + 2 N_1^* \delta^* + \delta^* (N_2 - N_1) < -4\eta N_2.$

((4.18) and (4.19) hold for the sequences for which (4.17) holds, except perhaps on a set of sequences whose probability is zero.) Since $N_1^* < N_1$ and $|\delta^*| < \eta$, we have, on the other hand,

(4.20)  $N_1 \delta^* + 2 N_1^* \delta^* + \delta^* (N_2 - N_1) > -3\eta N_1 - \eta (N_2 - N_1) > -4\eta N_1 - 4\eta (N_2 - N_1) = -4\eta N_2,$

which contradicts (4.19) and proves the desired result ((4.3) and (4.8)), except that it remains to prove that, no matter how large $\gamma$, the probability of temporarily stopping partial inspection at some $N > \gamma$ is one. Let $\gamma_0 > \gamma$ be some integer at which partial inspection is going on. From (4.2) and (4.7) it would follow, if partial inspection never ceased on a set of sequences with positive probability, that, on this set, with conditional probability one, for $N$ sufficiently large and $\varepsilon$ sufficiently small,


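$$\frac{k_N}{f(N - \gamma_0)} > \frac{L}{1 - f} + \varepsilon,$$

$$\frac{N}{N - \gamma_0} \cdot \frac{k_N (1 - f)}{fN} > L + (1 - f)\varepsilon,$$

$$e_N > \frac{N - \gamma_0}{N} L + \frac{(N - \gamma_0)(1 - f)\varepsilon}{N},$$

$$e_N > L + \frac{(1 - f)\varepsilon}{2}.$$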


This contradiction proves that complete inspection is eventually resumed and completes the proof of minimum inspection in Case a.

Proof of (4.10): We shall prove that (4.9) implies that, with probability one, complete inspection will cease, never to be resumed. For, from (4.15) and (4.9) it follows that for $N$ sufficiently large and $\varepsilon$ sufficiently small,

(4.21)  $\frac{D(N)}{N^*} = p' + \delta^* < L - 2\varepsilon.$

Hence, a fortiori,

(4.22)  $\frac{D(N)}{N} < L - 2\varepsilon.$

((4.21) and (4.22) hold with probability one.)

(3.1) states that, with probability one,

$$\lim_{N \to \infty} \left( e_N - \frac{D(N)}{N} \right) = 0.$$

Hence for all $N$ sufficiently large, with probability one,

$$e_N < L - \varepsilon,$$

i.e., with probability one complete inspection is never resumed.

When (4.9) holds, therefore, with probability one and with a finite number of exceptions SPA will require only partial inspection.

Proof of (4.12): If $p = \frac{L}{1 - f}$ and complete inspection finally never resumes, then (4.12) follows easily. If $p = \frac{L}{1 - f}$ and partial and complete inspection alternate infinitely many times, then the proof is similar to that of (4.8) and is therefore omitted. In either case the desired result follows.

5. A class of SP all of which insure both a given AOQL and minimum inspection. Let the definition of SPA be modified in the following particulars:

(b) Begin full inspection whenever

$$e_N = \frac{k_N}{N} \left( \frac{1}{f} - 1 \right) > L + \phi(N).$$

(c) Resume partial inspection when

$$e_N \le L - \psi(N).$$

Let $\phi(N)$ and $\psi(N)$ be such that

$$-\psi(N) \le \phi(N),$$

$$\lim_{N \to \infty} \phi(N) = \lim_{N \to \infty} \psi(N) = 0.$$

(SPA corresponds to the case $\phi(N) \equiv \psi(N) \equiv 0$.) Then all the SP of this class have the property that the AOQL is $L$ and that inspection is at a minimum in the sense of Section 4. The proofs are essentially the same as those for SPA and hence will be omitted.


6. The inspection plans of Section 5 can also be applied to lot inspection. We shall carry on the discussion of this section in terms of SPA, but the results apply to all the members of the class of plans described in Section 5. We shall show that SPA can also be applied when the product is submitted for inspection in lots. Although we assumed previously that the units of the product are arranged in order of production, the results obtained for SPA remain valid for any arbitrary arrangement of the units. If the product is submitted in lots we may arrange the units as follows: Let $l_1, l_2, \ldots$, etc. be the successive lots in the order of their submission for inspection. Within each lot we consider the units arranged in the order in which they are chosen for inspection. In this way we have arranged all units in an ordered sequence and the inspection can be applied as described before. Thus, we start with partial inspection, i.e., we take out groups of $1/f$ elements in $l_1$ and inspect one unit (selected at random) from each of these groups. When $e_N > L$, we start complete inspection and revert to partial inspection as soon as $e_N \le L$. When the units in $l_1$ are used up in the process of inspection, we continue, using the units of $l_2$, etc.

If it is found inconvenient to take out a group of $1/f$ units and then to select one unit for inspection, we could modify the sampling inspection plan as follows: Instead of taking out a group of $1/f$ units and then selecting at random one unit from it, we select at random one unit from the uninspected part of the lot and look upon this unit as the unit selected at random from a hypothetical group of $1/f$ units. Thus we can proceed exactly as before, except that we have to keep in mind that with each unit inspected under "partial inspection" we have used up another set of $\frac{1}{f} - 1$ units. Thus, as soon as $\frac{1}{f} - 1$ times the number of units inspected under "partial inspection" becomes equal to or greater than the number of units in the uninspected part of the lot, the inspection of that lot is already terminated, and we have to start using the units of the next lot. The inconvenience caused by the necessity of keeping track of the number of units inspected under "partial inspection" and of the number of units in the uninspected part of the lot can be eliminated by further modifying the inspection plan as follows: Instead of beginning complete inspection as soon as $e_N > L$, we continue "partial inspection" until $e_N - L$ is so large that complete inspection of all the units of the lot not yet used up has to be made in order to bring $e_N$ down to $L$ at the end of the lot. This leads to the following sampling procedure, to be known as SPB: Let $N_0$ be the number of units in the lot, let $N_L$ be the serial number of the last unit in the preceding lot, and let $E(N_L) = N_L (e_{N_L} - L)$ be the "excess" carried over from the preceding lot.

For simplicity assume that the following are all integers:

$$L N_0 = M, \qquad \frac{fM}{1 - f} = M^*, \qquad f N_0 = N^*, \qquad \frac{f E(N_L)}{1 - f} = E^*.$$

The inspection procedure is then as follows: Inspect successive units drawn at random until either

(a) $M^* - E^*$ defectives have been found in the first $N' \le N^*$ units inspected. In this case inspect further an additional $N_0 - \frac{N'}{f}$ units and this terminates the inspection of the lot. The excess to be carried over to the next lot is then zero.

Or

(b) $N^*$ units have been inspected and the number of defectives found is $H < M^* - E^*$. In this case the inspection of the lot is terminated and the present negative excess

$$E(N_L + N_0) = \frac{1 - f}{f} \, [H - (M^* - E^*)]$$

is carried over to the next lot. (The serial number of the last element in the present lot is $N_L + N_0$ and

$$e_{N_L + N_0} = \frac{N_L e_{N_L} + H \frac{1 - f}{f}}{N_L + N_0}.$$

Hence the present excess is

$$(N_L + N_0)[e_{N_L + N_0} - L] = N_L e_{N_L} + H \frac{1 - f}{f} - L N_L - L N_0 = N_L (e_{N_L} - L) + H \frac{1 - f}{f} - M = \frac{1 - f}{f} \, [H - M^* + E^*],$$

as given above.)

We note an important property of SPB: The excess carried over from a preceding lot is never positive.
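The lot procedure SPB can be illustrated by a short sketch. This is only one possible reading under stated assumptions: the function name `inspect_lot_spb`, the representation of a lot as a list of booleans (`True` for a defective unit), and the use of a seeded random shuffle to model "units drawn at random" are all hypothetical choices, and the integrality conditions of the text are assumed to hold for the parameters supplied.

```python
import random

def inspect_lot_spb(lot, L, f, carried_excess=0.0, rng=None):
    """Inspect one lot under SPB.

    lot: list of booleans, True marking a defective unit (illustrative encoding)
    carried_excess: E(N_L) from the preceding lot, never positive
    Returns (units_inspected, defectives_found, excess_carried_forward).
    """
    rng = rng or random.Random(0)
    n0 = len(lot)
    ratio = (1 - f) / f
    target = (L * n0 - carried_excess) / ratio   # M* - E*
    n_star = round(f * n0)                       # N* = f N0
    order = list(range(n0))
    rng.shuffle(order)                           # units are drawn at random
    found = 0
    for n_prime in range(1, n_star + 1):
        found += lot[order[n_prime - 1]]
        if found >= target - 1e-9:               # event (a), tolerance for rounding
            extra = n0 - round(n_prime / f)      # N0 - N'/f further units
            found += sum(lot[j] for j in order[n_prime:n_prime + extra])
            return n_prime + extra, found, 0.0
    # Event (b): N* units inspected, H < M* - E*; negative excess carried over.
    return n_star, found, ratio * (found - target)

print(inspect_lot_spb([False] * 450, 0.02, 0.1))
print(inspect_lot_spb([True] * 450, 0.02, 0.1))
```

With $N_0 = 450$, $L = .02$ and $f = .1$ we have $M = 9$, $M^* = 1$ and $N^* = 45$. A lot free of defectives is passed after 45 inspections with an excess of $-9$; a lot consisting entirely of defectives triggers event (a) at the first unit.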



7. Possible modifications of the SP to achieve local stability. Although the sampling plans discussed in previous sections are optimum in the sense that they guarantee the desired AOQL with a minimum of inspection when the production process is in statistical control, they do not always behave very favorably as far as local stability is concerned. To make this point clear, consider the following example: Suppose that during a very long initial time period the production process functions very well and the relative frequency of defectives produced is well below $L$. Thus, applying SPA, say, $e_N - L$ will be considerably less than zero at the end of this period. Now suppose that then the production process suddenly deteriorates and the number of defectives produced during the next period of time is considerably higher than $L$. In spite of that, complete inspection will not begin for quite some time because $e_N$ became so small during the initial period. Thus there will be a long segment in the sequence of outgoing units within which the relative frequency of defectives will be larger than the prescribed AOQL. Of course, this segment will be counterbalanced by other segments where the relative frequency of defectives will be below the AOQL, so that the AOQL will not be violated. Nevertheless, the occurrence of long segments with too many defectives, i.e., a lack of local stability, is not desirable.

It should be noted that, even though SPA was not designed to achieve considerable local stability, drastic lack of local stability cannot occur when the production process is in statistical control and SPA is employed. In the example given above where the outgoing quality was not locally stable, it was assumed that there were variations in the production process. The existence of statistical control acts as an important stabilizing factor on the quality.

In this section we want to discuss several possible modifications of SPA which will insure a greater degree of local stability. One such modification is the following: We choose a positive constant $A$ and we define the excess $E^*$ for each value $N$ as follows: $E^*(N)$ is equal to the excess $E(N)$ as originally defined ($= N[e_N - L]$) as long as for all $N' \le N$, $E(N') \ge -A$. The difference $E^*(N + 1) - E^*(N) = E(N + 1) - E(N)$ for all $N$ for which $E(N + 1) - E(N) \ge 0$. If $E(N + 1) - E(N) < 0$, then $E^*(N + 1) = \max \{E^*(N) + [E(N + 1) - E(N)], -A\}$. In other words, with this modification of the sampling inspection plan we set a lower bound $-A$ for the excess. When the excess is positive we begin complete inspection, and revert to partial inspection when the excess becomes non-positive. The effect of this is that, if the proportion of defectives produced becomes large, complete inspection will not be delayed very long, although the proportion of defectives produced in the preceding period may have been considerably below $L$. It is clear that this modification of SPA does not increase the AOQL. However, the amount of inspection will be somewhat increased, especially when the quality of the product is less than or only slightly greater than $L$. If the constant $A$ is large, the increase in the amount of inspection is only slight, but also the degree of local stability achieved is not very high. On the other hand, if $A$ is small, the increase in the amount of inspection may be considerable, but a high degree of local stability is achieved. Thus, the choice of $A$ should be made so that a proper balance between local stability and amount of inspection is achieved.
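The rule defining the bounded excess $E^*$ amounts to a running sum that is clipped from below at $-A$. The following sketch is illustrative only: the function name `bounded_excess` is hypothetical, and it assumes that both excesses start from zero before the first unit.

```python
def bounded_excess(excess, A):
    """Return E*(1..N) given E(1..N) and the lower bound -A.

    excess: list of the original excesses E(1), E(2), ...
    Assumes E(0) = E*(0) = 0 (an assumption of this sketch).
    """
    bounded = []
    prev_e, prev_star = 0, 0
    for e in excess:
        step = e - prev_e
        if step >= 0:
            star = prev_star + step            # increases are passed through unchanged
        else:
            star = max(prev_star + step, -A)   # decreases are clipped at -A
        bounded.append(star)
        prev_e, prev_star = e, star
    return bounded

print(bounded_excess([-1, -2, -3, -4, -3, -2], A=2))
```

In this example the original excess falls to $-4$ and is still at $-2$ after two increases, whereas the bounded excess has already returned to zero, so that complete inspection would begin with the next increase.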

Modifying SPA by setting a lower limit for the excess has the disadvantage that the mathematical treatment of this case is involved. We shall, therefore, consider another modification of the inspection plan which will have largely the same effect, but whose mathematical treatment appears to be much simpler. A fixed positive integer $N_0$ is chosen and the inspection scheme is designed so that $E(N_0) \le 0$ is assured. If $E(N_0)$ is negative, we replace it by zero. In other words, no excess is carried over from the first segment of $N_0$ units to the next segment of $N_0$ units. Thus, the second segment of $N_0$ units is treated exactly the same way as if it were the first segment, and this is repeated for each consecutive segment of $N_0$ units. This modification of SPA (the resulting plan is to be known as SPC) has essentially the same effect as setting a lower bound for the excess. Again it is clear that by this modification the AOQL is not increased, but the amount of inspection may be increased. The latter is particularly true when $N_0$ is small, which corresponds to very high local stability requirements. More efficient plans than SPC can probably be devised for this situation.

Undoubtedly, there are many other possible modifications of the inspection plan by which a greater degree of local stability can be achieved at the price of somewhat increased inspection. It is not the purpose of this paper to enumerate all these possibilities or to develop a theory as to which of them may be considered an optimum procedure. We shall restrict ourselves to a discussion of the mathematical consequences of SPC. First we define it precisely. If it is to be applied to inspection of lots of size $N_0$ then SPC is simply SPB with $E(N_L)$ and $E^*$ always zero. When applied to continuous production it will operate as follows: Assume for convenience that $M = L N_0$, $M^* = \frac{fM}{1 - f}$, $N^* = f N_0$, and $\frac{1}{f}$ are all integers.

(a) Begin each segment of $N_0$ units with partial inspection, i.e., inspect one unit chosen at random from each successive group of $\frac{1}{f}$ units. Continue partial inspection until one of the following events occurs: either

(b) $M^*$ defectives are found. In this case begin complete inspection with the first unit which follows the group in which the last of the $M^*$ defectives was found and continue until the end of the segment of $N_0$ units.

or

(b') $N^*$ groups of $\frac{1}{f}$ units are partially inspected.

(c) Repeat with the next segment of $N_0$ units.

Comparison with SPB shows that, in SPC, if (b) occurs earlier or at the same time as (b'), then $E(N_0) = 0$, while if (b') occurs before (b) we have $E(N_0) < 0$. In contradistinction to SPB, in SPC there is no carrying over of the excess.
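Steps (a) to (c) for continuous production can be sketched for a single segment as follows. The function name `spc_segment` and the encoding of a segment as a list of booleans (`True` for a defective unit, in order of production) are assumptions made for illustration, and the integrality conditions stated above are taken for granted.

```python
import random

def spc_segment(defective, L, f, rng):
    """Apply SPC to one segment of N0 units.

    Returns (units_inspected, defectives_passed_uninspected).
    """
    n0 = len(defective)
    group = round(1 / f)                      # size of each group, 1/f
    m_star = round(f * L * n0 / (1 - f))      # M* = f M / (1 - f), M = L N0
    inspected = found = passed = 0
    start = 0
    # (a) partial inspection, one randomly chosen unit per group,
    # until (b) M* defectives are found or (b') all groups are used up.
    while start < n0 and found < m_star:
        members = range(start, min(start + group, n0))
        chosen = rng.choice(members)
        inspected += 1
        for unit in members:
            if defective[unit]:
                if unit == chosen:
                    found += 1
                else:
                    passed += 1
        start += group
    # (b) complete inspection of whatever remains of the segment.
    inspected += max(n0 - start, 0)
    return inspected, passed

rng = random.Random(1)
print(spc_segment([False] * 450, 0.02, 0.1, rng))
print(spc_segment([True] * 450, 0.02, 0.1, rng))
```

In the second call every unit is defective: the first inspected unit already supplies the $M^* = 1$ defective, the nine other units of its group pass uninspected, and the remaining 440 units are completely inspected. The 9 defectives passed equal $M = L N_0$, so the outgoing fraction defective of the segment is exactly $L$.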

Let us determine the AOQ for SPC when the production process is in a state of statistical control. Denote by $p$ the probability that a unit produced will be defective. Let the chance variable $H$ denote the number of defectives found during partial inspection. The probability that $H = i < M^*$ is

$$\binom{N^*}{i} p^i (1 - p)^{N^* - i}.$$

$H \le M^*$ always. We have, when $H = i$,

$$E(N_0) = \frac{i(1 - f)}{f} - L N_0,$$

and hence

$$N_0 e_{N_0} = \frac{i(1 - f)}{f}.$$

The AOQ is therefore $\frac{1 - f}{f N_0}$ multiplied by the expected value of $H$ and is therefore

(7.1)  $\frac{1 - f}{f N_0} \left[ M^* - \sum_{i=0}^{M^* - 1} (M^* - i) \binom{N^*}{i} p^i (1 - p)^{N^* - i} \right].$

The reduction from the original quality $p$ to the AOQ was achieved by inspecting a fraction of units which is $\frac{1}{p}$ times the reduction in the frequency of defectives. Hence, with probability one, the fraction of units inspected when the production process is in statistical control is

(7.2)  $I = 1 - \frac{L}{p} + \frac{1 - f}{f N_0 p} \sum_{i=0}^{M^* - 1} (M^* - i) \binom{N^*}{i} p^i (1 - p)^{N^* - i}.$

When $p > \frac{L}{1 - f}$ we see from Section 4 that the third term of the right member of (7.2) represents the price paid in fraction of inspection above the minimum in return for the local stability achieved. When $p < \frac{L}{1 - f}$ the additional inspection is of course $I - f$.

As $N_0$ becomes larger, SPC becomes more and more like SPA, and consequently the amount of inspection tends to the minimum. As $N_0$ becomes smaller, the degree of local stability achieved becomes higher and must be paid for by an increasing amount of inspection. An illustrative example will be given in the next section. It has already been pointed out that the mere existence of statistical control implies a considerable amount of local stability even when SPA is applied.
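Formulas (7.1) and (7.2) are easily evaluated numerically. The sketch below is illustrative: the function name `spc_aoq_and_fraction` is hypothetical, and $M^*$ and $N^*$ are obtained by rounding on the assumption that the integrality conditions of the text hold.

```python
from math import comb

def spc_aoq_and_fraction(p, L, f, n0):
    """Return (AOQ, fraction inspected) for SPC under statistical control."""
    ratio = (1 - f) / f
    m_star = round(L * n0 / ratio)   # M* = f L N0 / (1 - f)
    n_star = round(f * n0)           # N* = f N0
    # Sum appearing in both (7.1) and (7.2).
    t_prime = sum(
        (m_star - i) * comb(n_star, i) * p**i * (1 - p) ** (n_star - i)
        for i in range(m_star)
    )
    aoq = ratio / n0 * (m_star - t_prime)                 # (7.1)
    fraction = 1 - L / p + ratio / (n0 * p) * t_prime     # (7.2)
    return aoq, fraction

for p in (0.01, 0.02, 0.05, 0.10):
    aoq, fraction = spc_aoq_and_fraction(p, L=0.02, f=0.1, n0=450)
    print(p, round(aoq, 5), round(fraction, 4))
```

For every $p$ the two quantities satisfy $I = (p - \text{AOQ})/p$, which is the relation used to pass from (7.1) to (7.2), and the AOQ never exceeds $L$.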



The only practical difficulty which may arise in evaluating the formulas in (7.1) and (7.2) might come from attempting to evaluate

$$T' = \sum_{i=0}^{M^* - 1} (M^* - i) \binom{N^*}{i} p^i (1 - p)^{N^* - i}.$$

For those values of the parameters which are likely to occur in application, a good approximation to $T'$ (exactly how good we shall not investigate here) is given by

$$T = \sum_{i=0}^{M^* - 1} (M^* - i) \, e^{-N^* p} \, \frac{(N^* p)^i}{i!}.$$

A table of $T$ for integral values of $M^*$ from 2 to 16 and for integral values of $N^* p$ from 1 to 25 is given below. The computations were performed under the direction of Mr. Mortimer Spiegelman of the Metropolitan Life Insurance Company, to whom the authors are deeply obliged.


Table of T

$T = \sum_{i=0}^{M^* - 1} (M^* - i) \, e^{-N^* p} \, (N^* p)^i / i!$

                                        N*p
M*-1      1      2      3      4      5      6      7      8      9     10     11     12
  1    1.10    .54    .25    .11    .05    .02    .01    .00    .00    .00    .00    .00
  2    2.02   1.22    .67    .35    .17    .08    .04    .02    .01    .00    .00    .00
  3    3.00   2.08   1.32    .78    .44    .23    .12    .06    .03    .01    .01    .00
  4    4.00   3.02   2.13   1.41    .88    .52    .29    .16    .08    .04    .02    .01
  5    5.00   4.01   3.05   2.20   1.49    .96    .59    .35    .20    .11    .06    .03
  6    6.00   5.00   4.02   3.08   2.26   1.57   1.04    .66    .41    .24    .14    .08
  7    7.00   6.00   5.01   4.03   3.12   2.31   1.64   1.12    .73    .46    .28    .17
  8    8.00   7.00   6.00   5.01   4.05   3.16   2.37   1.71   1.19    .79    .51    .32
  9    9.00   8.00   7.00   6.00   5.02   4.08   3.20   2.43   1.77   1.25    .85    .56
 10   10.00   9.00   8.00   7.00   6.01   5.03   4.10   3.24   2.48   1.83   1.31    .91
 11   11.00  10.00   9.00   8.00   7.00   6.01   5.05   4.13   3.28   2.53   1.89   1.37
 12   12.00  11.00  10.00   9.00   8.00   7.01   6.02   5.07   4.16   3.32   2.58   1.95
 13   13.00  12.00  11.00  10.00   9.00   8.00   7.01   6.03   5.08   4.19   3.36   2.63
 14   14.00  13.00  12.00  11.00  10.00   9.00   8.00   7.01   6.04   5.10   4.22   3.40
 15   15.00  14.00  13.00  12.00  11.00  10.00   9.00   8.01   7.02   6.05   5.12   4.25
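Entries of the table can be recomputed directly from the definition of $T$, and the exact binomial sum $T'$ can be compared with it. The following sketch is illustrative; the function names `poisson_t` and `binomial_t` are hypothetical.

```python
from math import comb, exp, factorial

def poisson_t(m_star, lam):
    """Poisson approximation T, with lam standing for N* p."""
    return sum(
        (m_star - i) * exp(-lam) * lam**i / factorial(i)
        for i in range(m_star)
    )

def binomial_t(m_star, n_star, p):
    """Exact binomial sum T'."""
    return sum(
        (m_star - i) * comb(n_star, i) * p**i * (1 - p) ** (n_star - i)
        for i in range(m_star)
    )

# Row M* - 1 = 5 of the table, columns N* p = 1, ..., 6.
print([round(poisson_t(6, lam), 2) for lam in range(1, 7)])
# Exact value against the approximation for N* = 500, p = .01.
print(binomial_t(6, 500, 0.01), poisson_t(6, 5))
```

Note that the row label of the table is $M^* - 1$, so that the row labelled 5 corresponds to `m_star = 6`.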


Dodge [1] has proposed a very interesting SP for continuous production. The plan is defined by two constants $i$ and $f$ and may be described as follows: Begin with complete inspection of the units consecutively as produced and continue such inspection until $i$ units in succession are found non-defective. Thereafter inspect a fraction $f$ of the units. Continue partial inspection until a defect is found. Then start complete inspection again and continue until $i$ units in succession are found non-defective. Repeat the procedure.

Dodge [1] derived formulas for determining the AOQL corresponding to any pair $i$ and $f$, under the assumption that the production process is in a state of statistical control. Dodge's formulas for the AOQL are not necessarily valid if we do not make this restriction on the production process, i.e., if we admit that the probability $p$ that a unit will be defective may vary in any arbitrary way during the production process. This, of course, is not a criticism of the derivation of the formulas; it cannot be considered surprising that a formula is not valid under assumptions different from those under which it was derived. However, it is relevant to point out the fact that the Dodge SP does not guarantee the AOQL under all circumstances, so that care must be taken to ensure that certain requirements are met. Exactly what these requirements are is not known; statistical control is a sufficient condition, but is probably not necessary and could be weakened. It seems likely to the authors that, if $p$ varies only slowly (with $N$) with infrequent "jumps," the Dodge SP will produce results which will exceed the AOQL by little, if at all. But if the "jumps" are numerous and appropriately spaced it is possible to exceed the AOQL by substantial amounts, as the example below will show. The Dodge plan was intended to serve as an aid to the detection and correction of malfunctioning of the production process and this use would tend to prevent the occurrence of such a phenomenon. Parenthetically, it should be remarked that the information obtained in the course of inspection according to either the plans discussed in this paper or any reasonable scheme should, if possible, be sent at once to the producing divisions for their guidance.

Table of T (Continued)

$T = \sum_{i=0}^{M^* - 1} (M^* - i) \, e^{-N^* p} \, (N^* p)^i / i!$

                                           N*p
M*-1     13     14     15     16     17     18     19     20     21     22     23     24     25
  1     .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  2     .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  3     .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  4     .01    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  5     .02    .01    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00    .00
  6     .04    .02    .01    .01    .00    .00    .00    .00    .00    .00    .00    .00    .00
  7     .10    .05    .03    .02    .01    .00    .00    .00    .00    .00    .00    .00    .00
  8     .20    .12    .07    .04    .02    .01    .01    .00    .00    .00    .00    .00    .00
  9     .36    .23    .14    .08    .05    .03    .01    .01    .00    .00    .00    .00    .00
 10     .61    .40    .26    .16    .10    .06    .03    .02    .01    .01    .00    .00    .00
 11     .97    .66    .44    .29    .18    .11    .07    .04    .02    .01    .01    .00    .00
 12    1.43   1.02    .71    .48    .32    .20    .13    .08    .05    .03    .02    .01    .01
 13    2.00   1.48   1.07    .75    .52    .35    .23    .15    .09    .06    .03    .02    .01
 14    2.68   2.05   1.54   1.12    .80    .55    .38    .25    .16    .10    .07    .04    .02
 15    3.44   2.72   2.10   1.59   1.17    .84    .59    .41    .27    .18    .12    .07    .05

An example to show that the AOQL can be exceeded can be constructed as follows: Let $i = 54$ and $f = 0.1$. Then according to the graphs of [1], page 272, the AOQL should be 0.02. Define a sequence of 60 successive units free of defectives as a segment of type 1, and a sequence of 60 successive units where the production process is in statistical control with $p = 0.1$, as a segment of type 2. Suppose that the sequence of units produced consists of segments of types 1 and 2 always alternating. Then it follows that the first item inspected in a segment of type 2 is always inspected on a partial inspection basis. We now assume that, unless the occurrence of a defective has previously terminated partial inspection, the 1st, 11th, 21st, 31st, 41st, and 51st items in a segment of type 2 will be chosen for partial inspection, and if the 1st item is found defective, the entire segment of type 2 will be cleared of defectives. (Both of these assumptions favor the Dodge SP.) Then the situation is as described in the following table:

Item     (1) Probability of first       (2) Expected number of defectives       (1) x (2)
         terminating partial            remaining in segment of type 2
         inspection at each item        after partial inspection has
                                        been terminated
1st      .1                             0                                       0
11th     (.9)(.1) = .09                 .9                                      .081
21st     (.9)^2(.1) = .081              1.8                                     .1458
31st     (.9)^3(.1) = .0729             2.7                                     .19683
41st     (.9)^4(.1) = .06561            3.6                                     .236196
51st     (.9)^5(.1) = .059049           4.5                                     .2657205

Probability that an entire segment of type 2 will be partially inspected: (.9)^6 = .531441
Expected number of defectives left in a segment of type 2 which has been inspected only partially: 5.4
Product: 2.8697814

Sum = 3.7953279

The AOQ is therefore $\frac{3.7953279}{120} = .0316+$, while $L = .02$.
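The arithmetic of this example can be checked with a few lines of code. The sketch below simply re-evaluates the table under the assumptions stated in the text; the function name `dodge_example` and its parameter names are hypothetical.

```python
def dodge_example(p=0.1, segment=60, sampled=(1, 11, 21, 31, 41, 51)):
    """Expected defectives passed per pair of segments, and the resulting AOQ."""
    expected = 0.0
    for k, item in enumerate(sampled):
        # Partial inspection first stops at this item: k clean samples, then a defective.
        prob_stop = (1 - p) ** k * p
        # Units before this item that were passed without inspection.
        uninspected_before = (item - 1) - k
        expected += prob_stop * p * uninspected_before
    # No sampled item is defective: the whole segment is only partially inspected.
    prob_never = (1 - p) ** len(sampled)
    expected += prob_never * p * (segment - len(sampled))
    # One segment of type 1 (no defectives) plus one segment of type 2.
    aoq = expected / (2 * segment)
    return expected, aoq

print(dodge_example())
```

The function returns the expected number of defectives remaining per segment of type 2 and the corresponding average outgoing quality over a pair of segments.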


It is therefore difficult to compare the Dodge plan with any of the plans described in this paper with respect to their effect on a production process not in statistical control. If the production process is in statistical control, then, as we have already seen, SPA requires minimum inspection (and, incidentally, because of the existence of statistical control, produces a fair degree of local stability). If, when statistical control exists, one requires both maintenance of a given AOQL and a higher degree of local stability than is produced by SPA, the relevant comparison is between the Dodge plan and SPC. Both will probably give good results as regards local stability, but it is not possible at present to make these intuitive notions precise, as we have not given an exact definition of local stability. The following example (in which statistical control is assumed) may not be unrepresentative of what the situation is with regard to the amount of inspection required.


Fraction of product inspected under the Dodge plan and under SPC when $L = .045$, $f = .1$







THE EXPECTED VALUE AND VARIANCE OF THE RECIPROCAL AND
OTHER NEGATIVE POWERS OF A POSITIVE BERNOULLIAN
VARIATE¹

By Frederick F. Stephan
War Production Board, Washington

1. Introduction. The expected value of the reciprocal of a Bernoullian variate appears in certain problems of random sampling wherein both practical considerations and mathematical necessity make zero an inadmissible value of the variate. This special condition excluding zero is necessary from a practical standpoint because statistics cannot be calculated from an empty class. It is a necessary condition, in the mathematical sense, for the expected value, and variances involving it, to be finite. When subject to this condition the Bernoullian variate will be designated the positive Bernoullian variate.

There appears to be no simple expression for the expected value of the reciprocal such as there is for the expected value of positive integral powers of the positive Bernoullian variate. This paper presents in (15) a factorial series, which can be computed conveniently to any desired number of terms by means of the recursion relation (18). Upper and lower bounds on the remainder may be computed readily from (20), (21), (23), (24), and (20) and the approximation may be improved by adding an estimate of the remainder taken between these bounds. A factorial series for the expected value of negative integral powers is given in (34). A factorial series for the expected value of the reciprocal of the positive hypergeometric variate is given in (53). Series for the variances follow directly from the series for expected values.

A simple example of the sampling problems in which this expected value appears is presented by the following instance of estimates derived from samples of variable size:

An infinite population consists of items of two kinds or classes, A and B. Lots of $N$ items each are drawn at random. In such lots the number of items, $x'$, that are of class A is an ordinary Bernoullian variate. Next, every lot composed entirely of items of class B is discarded. This excludes all lots for which $x' = 0$. From each remaining lot the $N - x'$ items of class B are set aside, leaving a sample composed entirely of items of class A. The number of such items, $x$, varies from sample to sample. It will be designated a positive Bernoullian variate since $x = x'$ if $x' > 0$ and $x$ does not exist if $x' = 0$. Finally, let there be associated with each item in class A a particular value of a variable, $y$, the variance of which in A is $\sigma^2$. Then if the mean value of $y$ is computed for each sample, the error variance of such means is $E(\sigma^2/x) = \sigma^2 E(1/x)$.

Instances similar to that just described occur in the design of sampling surveys from which statistics are to be obtained separately for each of several classes of the population, i.e., each statistic is to be computed from some part of the sample instead of all of it. They also occur in certain sampling problems in which some of the items drawn for a sample turn out to be blanks.

A related problem concerning the error variance of the proportion of males among infants born in any one year was considered by G. Bohlmann in a paper on approximations to the expected value and standard error of a function [1]. His approach to the problem was to expand the function in a Taylor series and take the expected value of each term. The conditions under which the resulting series converges were developed for certain functions of a Bernoullian variate. The present paper provides a different and, in certain respects, superior approach to the problem employing a method due to Stirling [2]. While the method is applied to the reciprocal and negative powers it is also applicable to certain other functions of a Bernoullian variate.

¹ Developed from a section of a paper presented to the Washington meeting of the Institute of Mathematical Statistics on June 18, 1943.


2. The positive Bernoullian variate. Let $x$ be a random variate defined by a Bernoullian probability function subject to the special condition $x > 0$. The probability of $x$ in $n$ is

(1)  $P(x) = \binom{n}{x} \frac{p^x q^{n - x}}{1 - q^n}$

where $x$ and $n$ are integers, $1 \le x \le n$, and

(2)  $\binom{n}{x} = \frac{n!}{x! \, (n - x)!}.$

The probabilities $p$ and $q$ are constants, $0 < p = 1 - q < 1$.

The divisor $1 - q^n$ arises from the condition excluding zero. (Bohlmann omits this factor, assuming that $q^n$ is negligible, an assumption that is not always valid.) An extension of this condition to exclude all values of $x$ less than a specified constant will be considered in a later section.

Throughout this paper summation is understood to be from $x = 1$ to $x = n$ unless it is shown otherwise.


3. Expected values and moments. The e.\fM*cted vah»*s of x tuul its pewitive 
integral powers are 

(3) ^■ix) rtp/d q") 

(4) it'Cx’l (npq f f»V^ d '/"l 


and, in general 

(5) AXx’) « r.-tl ” 9") I “■ q-‘ ‘ 


where e, is (he fth moment about isero of an ordinary Heriioullian variate with 
the same n and p and the 25 art* the ISlirling numbers of llie second kind fsr*e 
Table 1), 


The momenta alrout Eix) arc .Himt-wliat more cumpUcated than'TT'f; ,<;otre- 

'"i »»’%i K 

I i! r I I 



52 


fuederick p. btephan 


spending moments of the ordinary 
variance 


(6) - -SC®))’) = 


Bemoullian variate. 

a 3 n 

Ttpq _ 

r^g" (1 


For example, the 


and the third moment 
(7) E{{x - ^(a:))’! - 


(T - g")* a" - g"K 


The moments about np, the first moment of an ordinary Bemoullian variate, 
are 


(8) i?{(3: - np)’l = (a. + (~l)-'(rtp)V)/(l “ 9") 


TABLE 1

Stirling numbers of the second kind, $\mathfrak{S}_i^j$

   i \ j      1        2        3        4        5        6
     1        1        0        0        0        0        0
     2        1        1        0        0        0        0
     3        1        3        1        0        0        0
     4        1        7        6        1        0        0
     5        1       15       25       10        1        0
     6        1       31       90       65       15        1
     7        1       63      301      350      140       21
     8        1      127      966    1,701    1,050      266
     9        1      255    3,025    7,770    6,951    2,646
    10        1      511    9,330   34,105   42,525   22,827


where $\mu_i$ is the ith moment, about the mean, of an ordinary Bernoullian variate
with the same values n and p.


The expected value of the reciprocal is

(9)    $E(1/x) = \frac{1}{1 - q^n}\left[npq^{n-1} + \frac{1}{2}\binom{n}{2}p^2q^{n-2} + \cdots + \frac{1}{x}\binom{n}{x}p^xq^{n-x} + \cdots + \frac{1}{n}p^n\right]$.

This equation is not suitable for the computation of E(1/x) to a satisfactory
degree of approximation unless np is small, say less than 5 for most purposes.
The number of terms necessary to obtain a computed value with four significant
figures, for example, may be estimated to be approximately $8\sqrt{npq}/(1 - q^n)$.
Expressed as a function of q, E(1/x) becomes


( 10 ) 


E 


1 — g" 


n 


x+ 1 


a series which may be convenient for small values of q.

E(1/x) may be expanded in a power series by Taylor's Theorem. It may
also be expanded in a finite series of expected values of powers, either in E(x),
[...] or in $E(x - c)$, $E(x - c)^2$, ···, c being any positive constant. The
second of these three series may be obtained by expanding [...] and taking
expected values, and the third by dividing out $\frac{1}{x} = \frac{1}{c + (x - c)}$ and taking
expected values. For all three expansions, however, the terms become progressively
more complicated and laborious to compute. A simpler and more convenient
series for actual computations may be obtained by expanding 1/x in a
factorial series.

4. Expansion of E(1/x) in a series of inverse factorials. It is easy to prove
by induction that, if x > 0,

(11)    $\frac{1}{x} = \frac{0!}{x+1} + \frac{1!}{(x+1)(x+2)} + \cdots + \frac{(i-1)!\,x!}{(x+i)!} + \cdots + \frac{(t-1)!\,x!}{(x+t)!} + R_t(x)$

where

(12)    $R_t(x) = t!\,(x-1)!/(x+t)!$

is the remainder after the first t terms. This is, of course, an expansion in
Beta functions. It is also a simple special case of the expansion of a function
in a "faculty series" or series of inverse factorials [3] with an exact expression
for the remainder.
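Because (11) and (12) form an exact identity for every positive integer x and every t, they can be verified with rational arithmetic. The sketch below is only an illustration; the helper names `inverse_factorial_terms` and `remainder` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def inverse_factorial_terms(x, t):
    """The first t terms (i-1)! x! / (x+i)! of expansion (11), as exact fractions."""
    return [Fraction(factorial(i - 1) * factorial(x), factorial(x + i))
            for i in range(1, t + 1)]

def remainder(x, t):
    """The exact remainder R_t(x) = t! (x-1)! / (x+t)! of equation (12)."""
    return Fraction(factorial(t) * factorial(x - 1), factorial(x + t))

# Example: x = 4, t = 3 reproduces 1/4 exactly.
check = sum(inverse_factorial_terms(4, 3)) + remainder(4, 3)
```

Since every term and the remainder are positive, each partial sum of (11) underestimates 1/x, which is why the later sections add a bound on the expected remainder.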

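Let

(13)    $S_i = \sum_{x=1}^{n} \binom{n+i}{x+i} p^{x+i} q^{n-x} \Big/ (1 - q^n) = \left(1 - \sum_{x=0}^{i} \binom{n+i}{x} p^x q^{n+i-x}\right) \Big/ (1 - q^n)$.

Then, since

(14)    $\sum \frac{x!}{(x+i)!}\, P(x) = \frac{n!\, S_i}{(n+i)!\, p^i}$,

the expected value of (11) is

(15)    $E(1/x) = \frac{0!\, S_1}{(n+1)p} + \frac{1!\, S_2}{(n+1)(n+2)p^2} + \cdots + \frac{(t-1)!\, n!\, S_t}{(n+t)!\, p^t} + \sum R_t(x) P(x)$.

When developed as infinite series, both (11) and (15) are convergent since the
remainders $R_t(x) \to 0$ as $t \to \infty$.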

For computing purposes it is convenient to write

(16)    $E(1/x) = u_1 + u_2 + \cdots + u_t + E(R_t(x))$

in which, since

(17)    $S_i = S_{i-1} - \binom{n+i-1}{i} p^i q^n / (1 - q^n)$,

the following recursion relation exists between $u_i$ and $u_{i-1}$:

(18)    $u_i = \frac{(i-1)!\, n!\, S_i}{(n+i)!\, p^i} = \frac{(i-1)u_{i-1} - k/i}{(n+i)p}, \quad i > 1$,

        $u_1 = \frac{1 - k}{(n+1)p}$

where

(19)    $k = npq^n/(1 - q^n)$.

This reduces the computing of the $u_i$ to a simple repetitive procedure. The
computing is still simpler in those problems in which, for the degree of precision
desired, k is negligible.
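The repetitive procedure of (18) and (19) can be sketched in a few lines. This is an illustrative sketch only: the function names `reciprocal_terms` and `reciprocal_direct` are hypothetical, floating-point arithmetic replaces hand computation, and the direct evaluation of (9) is included merely as a check.

```python
from math import comb

def reciprocal_terms(n, p, t):
    """Terms u_1, ..., u_t of the factorial series (16), from the recursion (18)."""
    q = 1.0 - p
    k = n * p * q**n / (1.0 - q**n)            # equation (19)
    u = [(1.0 - k) / ((n + 1) * p)]            # u_1
    for i in range(2, t + 1):
        u.append(((i - 1) * u[-1] - k / i) / ((n + i) * p))
    return u

def reciprocal_direct(n, p):
    """E(1/x) from the binomial series (9), summing all n terms."""
    q = 1.0 - p
    total = sum(comb(n, x) * p**x * q**(n - x) / x for x in range(1, n + 1))
    return total / (1.0 - q**n)
```

With n = 100 and p = 0.1, the first term and the sum of the first three terms agree with the values .098,984 and .110,548 tabulated in Example 1, and the partial sums stay below the direct value because every omitted term is positive.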

An estimate of $E(R_t(x))$ should be added to the sum in (16) to improve the
approximation. To determine a suitable estimate, a lower bound for the expected
value of the remainders may be computed from one of the following
inequalities:

(20)    $E(R_t(x)) = \sum R_t(x)P(x) = \sum \frac{t!\,x!}{(x+t)!}\left(\frac{1}{m} - \frac{x-m}{m^2} + \frac{(x-m)^2}{m^2 x}\right)P(x) > \frac{2tu_t}{m} - \frac{t\{(t-1)u_{t-1} - tu_t\}}{m^2}, \quad m > 0$,

which is maximized by setting $m = \{(t-1)u_{t-1} - tu_t\}/u_t$, whence

(21)    $E(R_t(x)) > tu_t^2/\{(t-1)u_{t-1} - tu_t\}, \quad t > 1$.

Also, since when m = E(x)

(22)    $\sum (x - m)\frac{x!}{(x+t)!}P(x) < \frac{1}{(m+1)(m+2)\cdots(m+t)}\sum (x - m)P(x) = 0$,

a simpler inequality is

(23)    $E(R_t(x)) > tu_t(1 - q^n)/np$.

Further, if only the first c < n terms in (20) are taken,

(24)    $E(R_t(x)) > \sum_{x=1}^{c} w_x$

where

(25)    $w_1 = \frac{k}{(t+1)q}$   and   $w_x = \frac{(x-1)(n-x+1)p}{x(x+t)q}\, w_{x-1}$.

An upper bound may be computed from

(26)    $E(R_t(x)) <$

        (26.1)    $tu_t$

        (26.2)    $\frac{1}{2}tu_t + \frac{1}{2}w_1$

        (26.3)    $\frac{1}{3}tu_t + \frac{2}{3}w_1 + \frac{1}{3}w_2$

        (26.4)    $\frac{1}{c}tu_t + \sum_{x=1}^{c-1}\left(1 - \frac{x}{c}\right)w_x$

the choice among which may be governed by computing convenience. Taken
with (16), these inequalities provide lower and upper bounds for E(1/x).
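To see how the bounds bracket E(1/x), the following sketch evaluates (21), (23), (24) and (26.1) to (26.3) for a chosen t. It is a hedged illustration under stated assumptions: the names `u_terms`, `remainder_bounds` and `reciprocal_direct` are hypothetical, and $w_x$ is taken to be the xth term $R_t(x)P(x)$ of the remainder sum, which is the reading of (24) and (25) adopted here.

```python
from math import comb

def u_terms(n, p, t):
    """Terms u_1..u_t from recursion (18), together with k from (19)."""
    q = 1.0 - p
    k = n * p * q**n / (1.0 - q**n)
    u = [(1.0 - k) / ((n + 1) * p)]
    for i in range(2, t + 1):
        u.append(((i - 1) * u[-1] - k / i) / ((n + i) * p))
    return u, k

def remainder_bounds(n, p, t, c=3):
    """Sum of t terms plus lower and upper bounds on E(R_t(x))."""
    q = 1.0 - p
    u, k = u_terms(n, p, t)
    ut = u[t - 1]
    w = [k / ((t + 1) * q)]                       # w_1 of (25)
    for x in range(2, max(c, 2) + 1):             # w_x from w_{x-1}
        w.append(w[-1] * (x - 1) * (n - x + 1) * p / (x * (x + t) * q))
    lower = {"(23)": t * ut * (1.0 - q**n) / (n * p), "(24)": sum(w[:c])}
    if t > 1:                                     # (21) needs u_{t-1}
        lower["(21)"] = t * ut**2 / ((t - 1) * u[t - 2] - t * ut)
    upper = {"(26.1)": t * ut,
             "(26.2)": t * ut / 2 + w[0] / 2,
             "(26.3)": t * ut / 3 + 2 * w[0] / 3 + w[1] / 3}
    return sum(u), lower, upper

def reciprocal_direct(n, p):
    """E(1/x) by direct summation of (9), used only as a reference value."""
    q = 1.0 - p
    return sum(comb(n, x) * p**x * q**(n - x) / x
               for x in range(1, n + 1)) / (1.0 - q**n)
```

For n = 100, p = 0.1 and t = 1 the sum plus the bound (26.3) comes to about .132,167, and for t = 2 the sum plus the bound (21) comes to about .111,03, in line with the corresponding entries of Example 1.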

5. Examples. Two examples will serve to illustrate the factorial series (15).

EXAMPLE 1

Computation of E(1/x) for n = 100 and p = 0.1

np = 10        k = .000,265,621        E(1/x) = .111,527

         Binomial      Factorial     Factorial series
         sum of t      sum of t      lower                      Upper
   t     terms         terms         bounds*                    bound**
   1     .000,295      .098,984      .099,647                   .132,167
   2     .001,107      .108,675      .109,006 (.111,034)        .115,247
   3     .003,071      .110,548      .110,752 (.111,313)        .112,498
   4     .007,039      .111,082      .111,223 (.111,381)        .111,852
   5     .013,813      .111,280      .111,385 (.111,452)        .111,657
   6     .023,743      .111,370      .111,452 (.111,478)        .111,587
   7     .036,442      .111,416      .111,483 (.111,489)        .111,556
   8     .050,790      .111,444      .111,500 (.111,497)        .111,544
   9     .065,287      .111,461      .111,509 (.111,503)        .111,537
  10     .078,474      .111,472      .111,514 (.111,508)        .111,534
  11     .089,372      .111,481      .111,518 (.111,511)        .111,532
  12     .097,604      .111,487      .111,520                   .111,530
  13     .103,320      .111,492      .111,521                   .111,529
  14     .106,985      .111,495      .111,523                   .111,529
  15     .109,164      .111,498      .111,524                   .111,529
  16     .110,369      .111,501      .111,524                   .111,528
  17     .110,992      .111,503      .111,525                   .111,528
  18     .111,291      .111,505      .111,525,4                 .111,527,5
  19     .111,431      .111,506      .111,525,6                 .111,527,3
  20     .111,489      .111,508      .111,525,8                 .111,527,1

  24     .111,526

 100     .111,527 (end of series)

*Sum of t terms plus lower bound for $E(R_t(x))$ from (24) with c = 3. Numbers
in parentheses are calculated from (21).

**Sum of t terms plus upper bound on $E(R_t(x))$ from (26.3).


EXAMPLE 2

Computation of E(1/x) for n = 1000 and p = 0.3

np = 300        k = 3.7 × 10^{-153}

                                  Factorial series upper and lower bounds*
   t     Sum of t terms           (upper / lower)
   1     .003,330,003,330
   2     .003,341,081,185         .003,340,7 / .003,341,0 (.003,341,155,4)
   3     .003,341,154,817         .003,341,211 / .003,341,155
   4     .003,341,155,549         .003,341,156,29 / .003,341,156,56
   5     .003,341,155,559         .003,341,155,58 / .003,341,166,57

* Computed as in Example 1.

For the binomial series, the sum of the largest eight terms of (9), not the
first eight terms, is approximately .0007, which is less than 1/4 of the
value of E(1/x).


In the first example the value of np is almost small enough to make computation
by (9) convenient. In the second example about 120 terms of (9) must be computed
to obtain an approximation to four significant figures but only four terms
of the factorial series are needed to obtain seven significant figures. It is evident
that as np increases, the number of terms of (16) required to obtain an
approximation to a given number of significant figures decreases. The opposite
is true of (9) as n increases, or as p approaches a value near 1/2.


6. Extending the special condition. In some sampling problems all values
of x less than a specified value, g, and greater than another specified value, h,
are inadmissible. Then the probability of x in n is

(27)    $P(x) = \binom{n}{x} p^x q^{n-x} / s_{0,g,h}, \quad g \le x \le h$,

where

(28)    $s_{0,g,h} = \sum_{x=g}^{h} \binom{n}{x} p^x q^{n-x}$.

With this new condition, E(1/x) is given by (15) if $S_i$ is replaced by

(29)    $S_{i,g,h} = \sum_{x=g}^{h} \binom{n+i}{x+i} p^{x+i} q^{n-x} / s_{0,g,h}$

and the summation in the remainder term is from g to h. Also since

(30)    $S_{i,g,h} = S_{i-1,g,h} - \left\{\binom{n+i-1}{g+i-1} p^{g+i-1} q^{n-g+1} - \binom{n+i-1}{h+i} p^{h+i} q^{n-h}\right\} / s_{0,g,h}$

a recursion relation similar to (18) may be used in computing

(31)    $u_{i,g,h} = \frac{(i-1)!\, n!\, S_{i,g,h}}{(n+i)!\, p^i} = \frac{(i-1)u_{i-1,g,h} - (i-1)!\,\{k_g/(g+i-1)! + k_h/(h+i)!\}}{(n+i)p}$

where

(32)    $k_g = \frac{n!\, p^g q^{n-g+1}}{(n-g)!\, s_{0,g,h}}$

(33)    $k_h = -\frac{n!\, p^{h+1} q^{n-h}}{(n-h-1)!\, s_{0,g,h}}$   ($k_h = 0$ when h = n).

The inequalities (20) to (23) inclusive and (26) are applicable to this extension
on substitution of $u_{i,g,h}$ for $u_i$.


7. Expansion of $E(x^{-a})$ in a factorial series. Equation (11) may be extended
to other negative integral powers of x. If a is a positive integer

(34)    $E(x^{-a}) = \sum x^{-a}P(x) = \frac{b_{1,a}S_1}{(n+1)p} + \frac{b_{2,a}S_2}{(n+1)(n+2)p^2} + \cdots + \frac{b_{t,a}\,n!\,S_t}{(n+t)!\,p^t} + \sum R'_t(x)P(x)$

where

(35)    $R'_t(x) = \sum_{j=1}^{a} \frac{b_{t+1,j}\; x!}{(x+t)!\; x^{a-j+1}}$

and the $b_{i,j}$ are the absolute values of the Stirling numbers of the first kind (see
Table 2) formed by the recursion relation

(36)    $b_{i,j} = b_{i-1,j-1} + (i-1)\,b_{i-1,j}$,    $b_{1,1} = 1$,    $b_{i,j} = 0$ if j > i or j < 1.
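The expansion of $x^{-a}$ with the remainder (35) is exact for integer x, so it can be checked in rational arithmetic once the $b_{i,j}$ are generated from (36). The sketch below is illustrative only; `stirling1_table` and `negative_power_series` are hypothetical names, and the starting value $b_{1,1} = 1$ is assumed.

```python
from fractions import Fraction
from math import factorial

def stirling1_table(size):
    """b[i][j] for 0 <= i, j <= size: unsigned Stirling numbers of the first kind."""
    b = [[0] * (size + 1) for _ in range(size + 1)]
    b[1][1] = 1                                   # assumed starting value
    for i in range(2, size + 1):
        for j in range(1, i + 1):
            b[i][j] = b[i - 1][j - 1] + (i - 1) * b[i - 1][j]   # relation (36)
    return b

def negative_power_series(x, a, t, b):
    """Return (sum of t terms, remainder R'_t(x)) of the expansion of x**(-a)."""
    def ratio(i):                                 # x!/(x+i)! as an exact fraction
        return Fraction(factorial(x), factorial(x + i))
    head = sum(b[i][a] * ratio(i) for i in range(1, t + 1))
    tail = ratio(t) * sum(Fraction(b[t + 1][j], x ** (a - j + 1))
                          for j in range(1, a + 1))
    return head, tail
```

The table passed as `b` must extend at least to row t + 1 and column a. With a = 1 the coefficients reduce to $b_{i,1} = (i-1)!$ and the expansion coincides with (11).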


It is evident that

(37)    $\sum_j b_{i,j} = i!$

(38)    $b_{i,1} = (i-1)!$ and $b_{i,j} < i!$ if j > 1,


whence 


/ikl) “ ^ ^ J PO) 

< 2(x-bt)r 


< 


1 

(t+1) 


P(z). 


(39) 


* > 1 



Hence $R'_t(x) \to 0$ and $E(R'_t(x)) \to 0$ and the sum of the t terms of
(34) converges to $E(x^{-a})$ as $t \to \infty$.

The following recursion relation corresponding to (18) provides a simple procedure
for computing:

(40)    $u_{i,a} = \frac{b_{i,a}\,u_i}{(i-1)!} = b_{i,a}\,\frac{\{u_{i-1,a}/b_{i-1,a}\} - \{k/(i-1)!\}/i}{(n+i)p}$.

The computing procedure, then, follows a cycle of four simple operations:

1. Divide $\{k/(i-1)!\}$ by i.

2. Subtract the quotient from $\{u_{i-1,a}/b_{i-1,a}\}$.

3. Divide the difference by $\{(n+i-1)p\} + p$. The quotient is $u_{i,a}/b_{i,a}$.

4. Multiply this quotient by $b_{i,a}$.


TABLE 2

Absolute values of Stirling numbers of the first kind, $b_{i,j}$

   i \ j         1            2            3          4          5         6
     1           1            0            0          0          0         0
     2           1            1            0          0          0         0
     3           2            3            1          0          0         0
     4           6           11            6          1          0         0
     5          24           50           35         10          1         0
     6         120          274          225         85         15         1
     7         720        1,764        1,624        735        175        21
     8       5,040       13,068       13,132      6,769      1,960       322
     9      40,320      109,584      118,124     67,284     22,449     4,536
    10     362,880    1,026,576    1,172,700    723,680    269,325    63,273

These numbers are also known as differential coefficients of zero [4].


The expressions in braces are quantities obtained in the preceding cycle.

The $u_{i,a}$ may also be calculated from (18), or checked by such a calculation.

A lower bound for $E(R'_t(x))$ after t terms may be calculated from the first c
terms of

(41)    $E(R'_t(x)) = \sum R'_t(x)P(x) > \sum_{x=1}^{c} R'_t(x)P(x)$

,,i + ifj (n”- z) t (f- 5*) 

or from an inequality similar to (23) 


(42)    $E(R'_t(x)) > \frac{u_t}{(t-1)!}\sum_{j=1}^{a} \frac{b_{t+1,j}}{(E(x))^{a-j+1}}$

which may also be written


EiH'U)) > 


(45) 


(t- 1) 




{E{x) + i){Eix) + i - 1) • • • E(x) 


I n 

£ h,t.iiE(x)y 


An upper bound may be calculated from


(44) 
or 

(45) 


EiR'ix)) < < tit + l)u, 


it - l)!JrS 


x\P(x) 


Enm) < i:/f'(x)P(x) + i; 


< t + ,, “'f,-, K' 

(I ^ l)i )**1 (r r-l 


'ix)Pix) 


+ 


u, 

(t “ l)Ic' 


,»rl {(C + t)ic + t —])••• C ~ 23 ('l+J.yV 

r ;«»+! J 


8. The positive hypergeometric variate. The theory of sampling without
replacement from a finite population rests on the hypergeometric variate. Its
probability function is

(46)    $P(x \mid N, M, n) = \binom{M}{x}\binom{N-M}{n-x} \Big/ \binom{N}{n}$.

In applications to finite sampling, N is the number of items in the population,
M is the number of them that are of a certain kind, n is the number of items
drawn for the sample, and x is the number of items of the designated kind in the
sample.

As in the case of the Bernoullian variate, it is necessary to exclude zero in
defining the expected value of 1/x. The probability function of the positive
hypergeometric variate, then, is

(47)    $P_H(x) = P(x \mid N, M, n)/s_0, \quad x > 0$

where

(48)    $s_0 = 1 - P(0 \mid N, M, n)$.

Throughout this section the notation will have reference to (47) instead of (1).
The expected values of positive integral powers of x are

(49)    $E(x) = Mn/(Ns_0)$

(50)    $E(x^2) = \left\{\frac{M(M-1)n(n-1)}{N(N-1)} + \frac{Mn}{N}\right\} \Big/ s_0$

and, in general,

(51)    $E(x^i) = \sum_j \mathfrak{S}_i^j\, E\!\left(\frac{x!}{(x-j)!}\right)$

where the $\mathfrak{S}_i^j$ are the Stirling numbers of the second kind and

(52)    $E\!\left(\frac{x!}{(x-j)!}\right) = \frac{M!\; n!\; (N-j)!}{(M-j)!\,(n-j)!\; N!\; s_0}$.

The factorial series corresponding to (16) is

(53)    $E(1/x) = u_1 + u_2 + \cdots + u_t + E(R_t(x))$

where

(54)    $u_i = \sum \frac{(i-1)!\,x!}{(x+i)!}\,P_H(x)$

(55)    $E(R_t(x)) = \sum R_t(x)P_H(x)$.

The $u_i$ may be computed from

(56)    $u_1 = \frac{(N+1)s_1}{(M+1)(n+1)s_0} = \frac{1}{s_0}\left[\frac{N+1}{(M+1)(n+1)} - \frac{(N + Mn + 1)\,P(0 \mid N, M, n)}{(M+1)(n+1)}\right]$

and the recursion relation

(57)    $u_i = \frac{(i-1)(N+i)\,s_i}{(M+i)(n+i)\,s_{i-1}}\; u_{i-1}$

where

(58)    $s_i = 1 - \sum_{x=0}^{i} P(x \mid N+i, M+i, n+i)$.

The computing is quite simple in those instances in which $1 - s_t$ is negligible.
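The recursion for the hypergeometric case can be compared with a term-by-term evaluation of (54). This is a minimal sketch assuming the forms of (56) to (58) given above; the names `hyper_pmf`, `s`, `u_terms_hyper` and `term_direct` are hypothetical.

```python
from math import comb, factorial

def hyper_pmf(x, N, M, n):
    """P(x | N, M, n), the ordinary hypergeometric probability (46)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def s(i, N, M, n):
    """s_i of (58); s(0, ...) is the divisor s_0 of (48)."""
    return 1.0 - sum(hyper_pmf(x, N + i, M + i, n + i) for x in range(0, i + 1))

def u_terms_hyper(N, M, n, t):
    """u_1..u_t from (56) and the recursion (57)."""
    u = [(N + 1) * s(1, N, M, n) / ((M + 1) * (n + 1) * s(0, N, M, n))]
    for i in range(2, t + 1):
        ratio = ((i - 1) * (N + i) * s(i, N, M, n)
                 / ((M + i) * (n + i) * s(i - 1, N, M, n)))
        u.append(ratio * u[-1])
    return u

def term_direct(i, N, M, n):
    """u_i evaluated directly from its definition (54)."""
    s0 = s(0, N, M, n)
    return sum(factorial(i - 1) * factorial(x) / factorial(x + i)
               * hyper_pmf(x, N, M, n) / s0
               for x in range(1, min(M, n) + 1))
```

As with the Bernoullian case, the partial sums of the $u_i$ approach E(1/x) from below, and t times the last term gives a simple upper bound on what is left.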

Corresponding to (26), an upper bound for the expected value of the remainders
after t terms may be computed from

(59)    $E(R_t(x)) <$

        (59.1)    $tu_t$

        (59.2)    $\frac{1}{2}tu_t + \frac{1}{2}P_H(1)/(t+1)$

        (59.3)    $\frac{1}{3}tu_t + \frac{2P_H(1)}{3(t+1)} + \frac{2P_H(2)}{6(t+1)(t+2)}$

A lower bound for the expected value of the remainders may be computed
from one of the following inequalities corresponding to (23), (21) and (24):

(60)    $E(R_t(x)) > tu_tNs_0/(Mn)$

(61)    $E(R_t(x)) > tu_t^2/\{(t-1)u_{t-1} - tu_t\}$

(62)    $E(R_t(x)) > \sum_{x=1}^{c} R_t(x)P_H(x)$.

The expected values of other negative integral powers of the positive hypergeometric
variate may be calculated from

(63)    $E(x^{-a}) = \sum_{i=1}^{t} b_{i,a}u_i/(i-1)! + E(R'_t(x))$

where

(64)    $E(R'_t(x)) = \sum R'_t(x)P_H(x)$.

With $P_H(x)$ substituted for P(x), (39), (42), (43), (44), and (45) provide lower
and upper bounds for $E(R'_t(x))$ for the positive hypergeometric variate. Also,
corresponding to (41)

(65)    $E(R'_t(x)) > \sum_{x=1}^{c} R'_t(x)P_H(x)$.

9. Variance and moments of 1/x and $x^{-a}$. The variance of 1/x, which is
$E(1/x^2) - \{E(1/x)\}^2$, may be calculated from (16) and (34), with a = 2, for the
positive Bernoullian variate, and from (53) and (63), with a = 2, for the positive
hypergeometric variate. Likewise, the variance of $x^{-a}$ and the moments of
1/x and $x^{-a}$ about E(1/x) may be computed by the usual formulae.

REFERENCES

[1] G. Bohlmann,¹ "Formulierung und Begründung zweier Hilfssätze der mathematischen
Statistik," Math. Annalen, 74 (1913), 341-409.

[2] E. T. Whittaker and G. Robinson, The Calculus of Observations, London (Second
Ed.) 1937, p. 368.

[3] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Cambridge (Fourth Ed.) 1927,
p. 142.

[4] Charles Jordan, Calculus of Finite Differences, Budapest, 1937.

¹ The writer is indebted to Dr. Felix Bernstein for the reference to Bohlmann.



RANDOM WALK IN THE PRESENCE OF ABSORBING BARRIERS 


M. Kac 

Cornell University

1. Introduction. The problem of random walk (along a straight line) in the
presence of absorbing barriers can be stated as follows:

A particle, starting at the origin, moves in such a way that its displacements
in consecutive time intervals, each of duration Δt, can be represented by independent
random variables $X_1, X_2, X_3, \cdots$.

Moreover, if at some time the total (cumulative) displacement becomes > p
(p > 0) or < -q (q > 0) the particle gets absorbed. The problem is to determine
the probability that "the length of life" of the particle is greater than a
given number t. This problem also admits an interpretation in terms of a game
of chance in which the player quits when he loses more than q or wins more than
p. An interesting paper on this type of problem by A. Wald¹ appeared recently
in the Annals. Wald assumes that the X's are identically distributed and that
their mean and standard deviation are different from 0.² He is then mostly
interested in the limiting case when both the mean and the standard deviation
become small. The object of this paper is to propose a different method of
attack which in some cases leads to an answer in closed form. The method we
use has been employed repeatedly in statistical mechanics in the study of the
so called order-disorder problem. It is due, I believe, to E. W. Montroll³. As
far as the author knows this method was never used in connection with the
classical probability theory and this seems to furnish an additional reason for
publishing this paper.

2. The simplest discrete case. We assume that each X is capable of assuming
the values 1 and -1 each with probability ½ and for simplicity sake we let
Δt = 1. Note that, unlike in Wald's case, the mean of X is 0. Denote by N
the random variable which represents the "length of life" of the particle and
let (m an integer)

$\delta(m) = \begin{cases} \tfrac{1}{2}, & m = 1 \text{ or } m = -1, \\ 0 & \text{otherwise.} \end{cases}$

¹ A. Wald, "On cumulative sums of random variables," Annals of Math. Stat., Vol. 15
(1944), pp. 283-296.

² Since this was written Professor Wald informed the author that he can easily avoid the
condition that the mean should be zero.

³ See for instance E. W. Montroll, "Statistical mechanics of nearest neighbor systems,"
Jour. of Chem. Physics, Vol. 9 (1941), pp. 706-721.

Clearly we have (throughout this section we assume that both p and q are
integers)

$\text{Prob.}\{N > n\} = \text{Prob.}\{-q \le X_1 \le p,\; -q \le X_1 + X_2 \le p,\; \cdots,\; -q \le X_1 + \cdots + X_n \le p\} = \sum \delta(m_1)\delta(m_2)\cdots\delta(m_n)$,

where the summation is extended over all integers $m_1, m_2, \cdots, m_n$ for which
$-q \le m_1 \le p$, $-q \le m_1 + m_2 \le p$, ···, $-q \le m_1 + m_2 + \cdots + m_n \le p$.

Letting

$l_j = q + m_1 + \cdots + m_j, \quad (j = 1, 2, \cdots, n)$,

we see that

(1)    $\text{Prob.}\{N > n\} = \sum_{l_1, \cdots, l_n = 0}^{p+q} \delta(l_1 - q)\,\delta(l_2 - l_1)\cdots\delta(l_n - l_{n-1})$.

Let us now consider the (p + q + 1) by (p + q + 1) matrix

(2)    $A = ((\delta(j - k))) = \begin{pmatrix} 0 & \tfrac12 & 0 & 0 & 0 & \cdots \\ \tfrac12 & 0 & \tfrac12 & 0 & 0 & \cdots \\ 0 & \tfrac12 & 0 & \tfrac12 & 0 & \cdots \\ \cdot & \cdot & \cdot & \cdot & \cdot & \cdots \end{pmatrix}$.

It is easily seen that the sum in (1) is equal to the sum of the elements in the
(q + 1)-st column (or row) of the matrix $A^n$. Thus

Prob. {N > n} = sum of the elements of the (q + 1)-st column of $A^n$.

Denote by $\lambda_1, \lambda_2, \cdots, \lambda_{p+q+1}$ the eigenvalues of the matrix A and let

$(x_1^{(j)}, x_2^{(j)}, \cdots, x_{p+q+1}^{(j)})$

be the normalized eigenvector of A belonging to the eigenvalue $\lambda_j$. It can be
shown by elementary means⁴ that

$\lambda_j = \cos\frac{\pi j}{p + q + 2}$



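⁴ Matrices of type (2) have been introduced and studied in various connections. In a
paper by R. P. Boas and the present author recently accepted by the Duke Mathematical
Journal references to several authors are given. In order to find the eigenvalues and the
eigenvectors of (2) it suffices to know that

$\begin{vmatrix} 1 & a & 0 & \cdots & & \\ a & 1 & a & \cdots & & \\ \cdot & \cdot & \cdot & \cdots & \cdot & \cdot \\ & & \cdots & a & 1 & a \\ & & \cdots & 0 & a & 1 \end{vmatrix} = \frac{\rho_1^{m+1} - \rho_2^{m+1}}{\rho_1 - \rho_2}$,

where m is the order of the matrix and $\rho_1$ and $\rho_2$ are roots of the equation $\rho^2 - \rho + a^2 = 0$.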














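and

$x_k^{(j)} = \sqrt{\frac{2}{p + q + 2}}\;\sin\frac{\pi j k}{p + q + 2}$.

Denoting by R the orthogonal matrix

$R = \begin{pmatrix} x_1^{(1)} & \cdots & x_{p+q+1}^{(1)} \\ x_1^{(2)} & \cdots & x_{p+q+1}^{(2)} \\ \cdot & \cdots & \cdot \\ x_1^{(p+q+1)} & \cdots & x_{p+q+1}^{(p+q+1)} \end{pmatrix}$

and by R' the transposed of R we have (since the eigenvalues of A are simple)
by a well known theorem

$A^n = R' \begin{pmatrix} \lambda_1^n & & & 0 \\ & \lambda_2^n & & \\ & & \ddots & \\ 0 & & & \lambda_{p+q+1}^n \end{pmatrix} R$.

It thus follows by an easy computation that the sum of the elements of the
(q + 1)-st column (row) of $A^n$ is

$\sum_{j=1}^{p+q+1} \lambda_j^n\, x_{q+1}^{(j)} \left(\sum_{k=1}^{p+q+1} x_k^{(j)}\right)$.

We have

$\sum_{k=1}^{p+q+1} x_k^{(j)} = \sqrt{\frac{2}{p+q+2}} \sum_{k=1}^{p+q+1} \sin\frac{\pi j k}{p+q+2} = \begin{cases} 0, & j \text{ even}, \\ \sqrt{\frac{2}{p+q+2}}\,\cot\frac{\pi j}{2(p+q+2)}, & j \text{ odd}, \end{cases}$

and therefore⁵

$\text{Prob.}\{N > n\} = \frac{2}{p+q+2} \sum_{j=1}^{p+q+1}{}^{\!*}\; \cos^n\frac{\pi j}{p+q+2}\; \sin\frac{\pi j(q+1)}{p+q+2}\; \cot\frac{\pi j}{2(p+q+2)}$,

where the star on the summation sign indicates that only odd j's are taken into
account.

The method just illustrated is quite general but in more complicated cases
the job of finding the eigenvalues and eigenvectors becomes formidable.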
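The closed formula of this section can be compared with a direct step-by-step propagation of the probabilities, which is what multiplication by the matrix A amounts to. The sketch below is an illustration under stated assumptions: the names `survival_by_steps` and `survival_closed_form` are hypothetical, and the particle is taken to be alive while its displacement lies between -q and p inclusive (the p + q + 1 states of the matrix).

```python
from math import cos, sin, tan, pi

def survival_by_steps(p, q, n):
    """Prob{N > n} by propagating the distribution over the states 0..p+q."""
    m = p + q + 1
    prob = [0.0] * m
    prob[q] = 1.0                       # start at the origin, i.e. state q
    for _ in range(n):
        new = [0.0] * m
        for state, mass in enumerate(prob):
            if state > 0:
                new[state - 1] += mass / 2   # step -1 (mass leaving state 0 is absorbed)
            if state < m - 1:
                new[state + 1] += mass / 2   # step +1 (mass leaving state p+q is absorbed)
        prob = new
    return sum(prob)

def survival_closed_form(p, q, n):
    """Prob{N > n} from the trigonometric sum over odd j."""
    L = p + q + 2
    total = 0.0
    for j in range(1, L, 2):            # odd j = 1, 3, ..., at most p+q+1
        theta = pi * j / L
        total += cos(theta) ** n * sin(theta * (q + 1)) / tan(theta / 2)
    return 2.0 * total / L
```

For p = q = 1 and n = 2 both functions give one half: after two steps the particle has either returned to the origin or been absorbed.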


⁵ Professor Feller has called the author's attention to the fact that similar problems and
formulae can be found in Chapter III of W. Burnside's Theory of Probability (Cambridge,
1928). He also pointed out that the problem could be treated by means of Markoff chains.

Professor G. E. Uhlenbeck has pointed out that our formula implies a known
result from the theory of Brownian motion.

Consider a free Brownian particle which at t = 0 is at $x = x_0$ ($x_0 > 0$). R.
Fürth⁶ has shown that the probability that between t and t + dt the particle
will be either at x = 0 or at x = d ($0 < x_0 < d$) for the first time, is given by the
formula

$\frac{4\pi D}{d^2} \sum_{m=0}^{\infty} (2m+1) \exp\left(-\frac{(2m+1)^2\pi^2 D t}{d^2}\right) \sin\frac{(2m+1)\pi x_0}{d}\; dt$,

where D is the "coefficient of diffusion."

If we treat the one-dimensional Brownian motion as a random walk with steps
±Δx, each move lasting Δt, the probability that a particle starting from $x_0$ will
not have reached 0 or d in the time interval (0, t) can be calculated by means
of our formula.

We must only put $q = x_0/\Delta x$, $p = (d - x_0)/\Delta x$, $n = t/\Delta t$ and assume that as
both Δx and Δt approach 0 the ratio $(\Delta x)^2/2\Delta t$ approaches the "coefficient of
diffusion" D.

An elementary computation shows that in this limit the Prob. {N > t/Δt}
approaches

$\frac{4}{\pi} \sum_{m=0}^{\infty} \frac{1}{2m+1} \exp\left(-\frac{(2m+1)^2\pi^2 D t}{d^2}\right) \sin\frac{(2m+1)\pi x_0}{d}$

and that the differential of this expression (with a minus sign) gives exactly
Fürth's expression.


3. General theory in the continuous case. We now assume that the distribution
function of X possesses a continuous and even density function ρ(x). We
have

$\text{Prob.}\{N > n\} = \int \cdots \int_{\Omega} \rho(x_1)\cdots\rho(x_n)\, dx_1 \cdots dx_n$,

where the region of integration Ω is defined by the inequalities

$-q < x_1 < p,\quad -q < x_1 + x_2 < p,\quad \cdots,\quad -q < x_1 + \cdots + x_n < p$.

Introducing the new variables

$y_j = q + x_1 + \cdots + x_j, \quad (j = 1, 2, \cdots, n)$,

we see that the Jacobian of the transformation is 1 and

(3)    $\text{Prob.}\{N > n\} = \int_0^{p+q} \cdots \int_0^{p+q} \rho(y_1 - q)\,\rho(y_2 - y_1)\cdots\rho(y_n - y_{n-1})\, dy_1 \cdots dy_n$.

Consider the symmetric integral equation

(4)    $\int_0^{p+q} \rho(s - t) f(t)\, dt = \lambda f(s)$

and note that if $K_n(s, t)$ denotes the n-th iterated kernel of this integral equation,
the right side of (3) is equal to

$\int_0^{p+q} K_n(q, t)\, dt$.

Thus

$\text{Prob.}\{N > n\} = \int_0^{p+q} K_n(q, t)\, dt$.

From the general theory of integral equations we know that

$K_n(s, t) = \sum_{j=1}^{\infty} \lambda_j^n f_j(s) f_j(t), \quad (n \ge 2)$,

where $\lambda_1, \lambda_2, \cdots$ are eigenvalues and $f_1(t), f_2(t), \cdots$ normalized eigenfunctions
of the integral equation (4).

Since ρ was assumed to be continuous it follows that the eigenfunctions are
continuous and

$\text{Prob.}\{N > n\} = \sum_{j=1}^{\infty} \lambda_j^n f_j(q) \int_0^{p+q} f_j(t)\, dt$.

This formula is very general and provides, in a sense, a complete solution of the
problem in the continuous and symmetric case. Unfortunately the usefulness
of this formula is limited by the difficulties encountered in solving integral
equations of the type (4).

In fact, the integral equation

$\frac{1}{\sqrt{2\pi}\,\sigma} \int_0^{p+q} e^{-(s-t)^2/2\sigma^2} f(t)\, dt = \lambda f(s)$

to which one is led by considering the normally distributed X's, appears to be
very difficult to solve.

⁶ Ann. d. Phys. 53 (1917) p. 177.


4. A particular case. If we assume

$\rho(x) = \tfrac{1}{2} e^{-|x|}$

we are led to the integral equation

(5)    $\int_0^{p+q} e^{-|s-t|} f(t)\, dt = 2\lambda f(s)$,⁷

which is quite easy to solve.

In fact, rewriting (5) in the form

(6)    $e^{-s} \int_0^{s} e^{t} f(t)\, dt + e^{s} \int_s^{p+q} e^{-t} f(t)\, dt = 2\lambda f(s)$

and differentiating twice with respect to s we obtain the differential equation

$f''(s) + \left(\frac{1}{\lambda} - 1\right) f(s) = 0$.

Substituting the general solution of this equation in (6) we find in an entirely
elementary fashion that

$\lambda_j = \frac{1}{1 + y_j^2}, \qquad f_j(t) = \frac{\sin y_j t + y_j \cos y_j t}{\sqrt{1 + \tfrac{1}{2}(p+q)(1 + y_j^2)}}$,

where $y_j$ is the jth (positive) root of the transcendental equation

(7)    $\tan (p+q)y = -\frac{2y}{1 - y^2}$.

We have

$\int_0^{p+q} (\sin y_j t + y_j \cos y_j t)\, dt = \frac{1}{y_j}\,[\,1 - \cos (p+q)y_j + y_j \sin (p+q)y_j\,]$

and it is easily seen that (7) implies

$1 - \cos (p+q)y_j + y_j \sin (p+q)y_j = \begin{cases} 0 & \text{if } \cos (p+q)y_j = \dfrac{1 - y_j^2}{1 + y_j^2}, \\[2mm] 2 & \text{if } \cos (p+q)y_j = -\dfrac{1 - y_j^2}{1 + y_j^2}. \end{cases}$

Finally,

$\text{Prob.}\{N > n\} = 2 \sum_j{}^{\prime}\; \frac{\sin y_j q + y_j \cos y_j q}{(1 + y_j^2)^n\; y_j\, \{1 + \tfrac{1}{2}(p+q)(1 + y_j^2)\}}$,

where the dash on the summation sign indicates that only those j's are taken
into account for which

$\cos (p+q)y_j = -\frac{1 - y_j^2}{1 + y_j^2}$.

We omit here the discussion of various limiting cases inasmuch as our main
purpose was to obtain exact formulas.

There are indications that some of the limiting cases are related to singular
integral equations with continuous spectra. We may return to this subject
at a later date.

⁷ I have recently encountered the integral equation (5) in solving an entirely different
problem. A complete discussion can be found in a restricted N.D.R.C. Report 14-305.



ON THE CLASSIFICATION OF OBSERVATION DATA 
INTO DISTINCT GROUPS 

By R. v. Mises
Harvard University

Introduction. In scholastic examinations as well as in the examination of
industrial products the following probability problem arises. The individuals
of a certain population are successively subjected to trials each of which leads
to a definite score x (one real number or a group of m real numbers). Each
individual is supposed to belong to one of n classes. These classes are characterized
by n probability densities $p_1(x), p_2(x), \cdots, p_n(x)$. One has to decide on
the basis of the observed value x to which class the respective individual belongs
and one wishes to make this decision with the smallest possible risk of failure.

For example, let us consider an examination where the three grades A, B, C
are attributed on the basis of a simple score x (case m = 1, n = 3). It may be
assumed that an individual of the class A has a mean expected value of x equal
to $\vartheta_1 = 75$ and a normal distribution with the standard deviation $\sigma_1 = 4/\sqrt{2}$.
The analogous values for the classes B and C may be $\vartheta_2 = 50$, $\sigma_2 = 8/\sqrt{2}$ and
$\vartheta_3 = 25$, $\sigma_3 = 12/\sqrt{2}$. In this case, the solution developed in the present paper
allows the conclusion that the best way of grading would be to attribute the
grade A to scores x beyond 70.0, the grade C to scores below 40.0 and B to the
rest. The corresponding error risk will be 3.9% or the success rate 0.961.
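The figures quoted for the grading example can be reproduced with the normal distribution function. The sketch below is only a check of the stated numbers: the helper `normal_cdf`, the dictionary `classes` and the function `success_rates` are hypothetical names, and the cutoffs 40.0 and 70.0 are taken from the text rather than derived.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal variate."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

# Means and standard deviations of the three classes, as given in the text.
classes = {"A": (75.0, 4.0 / sqrt(2.0)),
           "B": (50.0, 8.0 / sqrt(2.0)),
           "C": (25.0, 12.0 / sqrt(2.0))}

def success_rates(low_cut=40.0, high_cut=70.0):
    """Probability of a correct grade for an individual of each class."""
    rate_a = 1.0 - normal_cdf(high_cut, *classes["A"])
    rate_b = normal_cdf(high_cut, *classes["B"]) - normal_cdf(low_cut, *classes["B"])
    rate_c = normal_cdf(low_cut, *classes["C"])
    return rate_a, rate_b, rate_c
```

With the cutoffs of the text the three rates nearly coincide at about 0.961, so the smallest of them, which is the guaranteed success rate, leaves an error risk of about 3.9%.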

There exists, of course, one case where the solution is trivial. If the probability
densities $p_\nu(x)$ are limited to n non-overlapping regions $R_\nu$ (with $p_\nu = 0$ at points
outside $R_\nu$) an obvious decision can be made without any risk of failure. An
assumption of this kind underlies the usual procedure of grading. If, in the
foregoing example, an individual of class A is supposed to have at any rate a score
beyond 60 and a class C individual less than 40, it is obvious how the grades
should be attributed without incurring any risk. It seems, however, that in
many problems the assumption of normal distributions or some other kind of
overlapping distributions is more appropriate. Then, the probability problem
has to be solved.

The solution submitted in the present paper is derived from the simplest
principles of calculus of probability without any arbitrary assumption or hypothesis.
If n equals 2, the problem can also be considered as a problem of testing
a simple statistical hypothesis with a two-valued parameter.¹ It has been
shown in an earlier paper² that under this restriction success rates higher than
50% are obtainable.

¹ See A. Wald, Annals of Math. Stat., Vol. 15 (1944), p. 146. Here, both $p_1(x)$ and $p_2(x)$
are supposed to be normal distributions with the same covariance matrix. The problem
treated by Wald is different from the one considered in the present paper since in Wald's
paper the parameters of the two multivariate normal distributions are assumed to be
unknown.

² R. v. Mises, Annals of Math. Stat., Vol. 14 (1943), p. 238.

1. Statement of the problem. For each of n classes of individuals a probability
density $p_\nu(x)$, ν = 1, 2, ··· n, is given. We subdivide the m-dimensional
x-space into n regions $R_1, R_2, \cdots, R_n$ and assign the region $R_\nu$ to the νth class.
The probability, for an individual of this class, to have its x-value falling in
$R_\nu$ is

(1)    $P_\nu = \int_{(R_\nu)} p_\nu(x)\, dX, \quad \nu = 1, 2, \cdots, n$

where dX denotes the element of the x-space (dX = dx in the case m = 1).
In the N first trials of the indefinite sequence of trials, $N_\nu$ individuals that
belong to the νth class will be tested. Out of these only those individuals whose
x-value falls in $R_\nu$ will be ascribed to the νth class. Their number, according
to the definition of probability, equals $N_\nu(P_\nu + \epsilon_\nu)$ where $\epsilon_\nu$ tends towards zero
as $N_\nu$ goes to infinity. The total number of correct decisions during the N first
trials is therefore

(2)    $N_1(P_1 + \epsilon_1) + N_2(P_2 + \epsilon_2) + \cdots + N_n(P_n + \epsilon_n)$

and the relative number is

(2')    $\frac{N_1}{N}(P_1 + \epsilon_1) + \frac{N_2}{N}(P_2 + \epsilon_2) + \cdots + \frac{N_n}{N}(P_n + \epsilon_n)$.

If N increases indefinitely a part of the $N_\nu$ must become infinite. For these
classes, $\epsilon_\nu$ converges toward zero. For the other classes $N_\nu/N$ diminishes to
zero. Thus, the relative number of right decisions converges towards

(3)    $\frac{1}{N}(N_1P_1 + N_2P_2 + \cdots + N_nP_n)$.

The $N_\nu$ are unknown. Every one of these unknowns can take each value from
zero to N. If $P_\mu$ is the smallest $P_\nu$, the most unfavorable case, where the
expression (3) has its smallest value, will occur with $N_\mu = N$, all other $N_\nu$ being
zero. This value is obviously $P_\mu$. Thus it is seen that the frequency of correct
assignments is at least equal to the smallest $P_\nu$ which may be written as $P_{\min}$.
The greatest risk of making a false decision is $1 - P_{\min}$.

Now the problem to be solved in the present paper can be stated as follows:
For n given densities $p_\nu(x)$, find the subdivision of the x-space into n regions $R_\nu$
that gives to the smallest of the expressions $P_\nu$ defined in (1) its greatest possible
value.

This problem has the type of a continuous variation problem with the integrals
in question bounded within the limits zero to one. We may, therefore, assume
that under reasonable restrictions for $p_\nu(x)$ a solution exists. Uniqueness of
the solution cannot be expected in general. It seems very difficult to establish
the conditions for unicity in other than the most simple cases. Existence of
more than one solution would mean that each of them is an optimum with
respect to infinitesimal modifications of the boundaries.


70 


H. V. MieES 


2. General solution. A simple problem of variation i.s conaideretl aij solved 
in principle when the nature of the extremals is known, In onr case of a so- 
called minimax problem, where the minimum of n quantities is maximized, an 
additional relation between the n integrals is rerjuired. Both can easily be 
found in the actual case. 

Let us first consider a partition of tho x-space into n regions with not all P, 
being equal. The smallest P, will be called Pmin and the smallest hut one P*. 
Among the k regions for which P, = Pmin there will be at least one, say, P* that 
has a common border with a region whose P-value is greater, so that Pg 
P*. Now modify the boundary between and Rg in such a way that the space 
covered by Pa is increased and that of Rg decreased. According to (I) the new 
values of Pa and Pg will be 

(4) Pi =P.+ A, P'g^Pg- A' 

with both A and A' positive The two quantities A and A' arc not independent 
of one another, hut they can be chosen both smaller than any given positive 
number t. Therefore, the condition 

(5) p: = Pa + A < - A' = p; 

can be fulfilled. All other P,-values remain unchanged. 

In the case fc = 1, that is, if only one region R, had originally the minimum 
P-value, the modified system has a greater minimum P, which equals either 
Pa + A or P*. If fc > 1 the new system has the same minimum P as the original 
one, but its fc-value is diminished by one. If we repeat'the same procedure 
(fc — 1) times we obtain a system of regions with one single P, having the mini¬ 
mum P-value and the next step leads to a partition of the i-spaoe into n regions 
with a smallest P-value that is greater than the original P„ia. Thus it Is seen 
that no partition with unequal P,“valuw can solve our problem. 

Secondly, if m > 1, consider a system of n regions with P = Pi »= P* » • • • « 
Pn . Take two points, x and y, on the border of any two neighboring regions 
R, and P„. An infinitesimal variation of the boundary would consist of adding 
to Ry in the neighborhood of the point a; a space element 5(S subtracting it from 
Rii and, at the same time, adding to m the vicinity of y an element SS' sub¬ 
tracting it from R,. Then, according to (1), the new values of P, and P,, will be 

(») P; = P + p.{x)6S - py{v)SS' 

pI = P — p^{x)ss -f- p^(y)dS'. 

Introducing A, = P( — P and A,, = — P, these equations solved for BS and 

SS' give 

(7) 85 = + PM(y)A> ^ p,(x)Am + pg(x)A, 

D ’ ~ n 


where 

(70 


B = p,(x)p^(y) - p^(x)p,(y). 





If the determinant $D$ is positive, we find two positive quantities $\delta S$ and $\delta S'$ for any pair of positive $\Delta_\nu$ and $\Delta_\mu$. If $D$ is negative the same is true when $x$ and $y$ are interchanged. In both cases, that is, with $D \neq 0$, the original partition is replaced by a new system of regions in which only two regions, $R_\nu$ and $R_\mu$, have increased $P$-values, while (if $n > 2$) still $P_{\min} = P$. If to this system the procedure as described in the foregoing is applied, a final partition with a greater minimum value of $P$ can be derived. The conclusion is that no solution of our problem can include a boundary on which the determinant $D$ is different from zero for any two points $x$ and $y$. On the other hand, it is seen that $D = 0$ means that the ratio $p_\nu(x) : p_\mu(x)$ has a constant value along the border. Thus the result is reached:

The partition of the $x$-space that solves our problem is characterized by two properties: (1) for all $n$ regions $R_\nu$ the value of $P_\nu$ is the same; (2) along the border between $R_\nu$ and $R_\mu$ the ratio $p_\nu(x)/p_\mu(x)$ is constant.

In the one-dimensional case ($m = 1$) only the first of these two statements is relevant. In any case, the success rate, that is, the guaranteed ratio of correct decisions, equals the common value of all $P_\nu$.

3. Illustrations. (a) One-dimensional case. Upon introducing the cumulative distribution functions

(8) $$F_\nu(x) = \int_{-\infty}^{x} p_\nu(z)\,dz$$

the conditions $P_1 = P_2 = \cdots = P_n$ take the form

(9) $$F_1(x_1) = F_2(x_2) - F_2(x_1) = F_3(x_3) - F_3(x_2) = \cdots = F_{n-1}(x_{n-1}) - F_{n-1}(x_{n-2}) = 1 - F_n(x_{n-1})$$

where $x_1, x_2, \ldots, x_{n-1}$ determine the $n$ intervals on the both-sides infinite $x$-axis. If all density functions have the same form except for an affine transformation, one has

(10) $$F_\nu(x) = F[h_\nu(x - \vartheta_\nu)], \qquad \nu = 1, 2, \ldots, n.$$

Let us assume, for instance, that scores between 0 and 100 are attributed to three types of individuals. The first type may have an even chance to obtain a score between 0 and 50, the second between 40 and 80 and the third between 70 and 100. Here

(11) $$F_\nu(x) = \tfrac{1}{2} + (x - \vartheta_\nu)\,h_\nu, \qquad |x - \vartheta_\nu| \le \frac{1}{2h_\nu},$$

with $\vartheta_\nu = 25, 60, 85$ and $h_\nu = \tfrac{1}{50}, \tfrac{1}{40}, \tfrac{1}{30}$. The conditions (9) supply

(12) $$\frac{1}{2} + \frac{x_1 - 25}{50} = \frac{x_2 - x_1}{40} = \frac{1}{2} - \frac{x_2 - 85}{30}$$

and this, solved for $x_1$, $x_2$, gives $x_1 = 41\tfrac{2}{3}$, $x_2 = 75$, while the three expressions (12) take the value 0.833. Therefore, in attributing all scores below $41\tfrac{2}{3}$ to the first class and all scores beyond 75 to the third, one is safe to make under no circumstances more than $\tfrac{1}{6}$ incorrect decisions in the long run.
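The score example can be checked with exact arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that each cut point falls where the two adjacent score ranges overlap, so that every condition in (9) is linear.

```python
from fractions import Fraction

def uniform_cdf(x, lo, hi):
    """CDF of a uniform distribution on [lo, hi], kept exact with Fractions."""
    if x <= lo:
        return Fraction(0)
    if x >= hi:
        return Fraction(1)
    return Fraction(x - lo, hi - lo)

# Score ranges of the three types, as stated in the text.
ranges = [(0, 50), (40, 80), (70, 100)]

def equalizing_cuts(ranges):
    """Solve F1(x1) = F2(x2) - F2(x1) = 1 - F3(x2) for uniform classes.

    Under the overlap assumption, consecutive cuts are spaced width * P apart,
    so the widths times P must add up to the total span of the scale.
    """
    widths = [hi - lo for lo, hi in ranges]
    lo, hi = ranges[0][0], ranges[-1][1]
    p = Fraction(hi - lo, sum(widths))
    cuts, x = [], Fraction(lo)
    for w in widths[:-1]:
        x += w * p
        cuts.append(x)
    return p, cuts

p, (x1, x2) = equalizing_cuts(ranges)
checks = [
    uniform_cdf(x1, *ranges[0]),
    uniform_cdf(x2, *ranges[1]) - uniform_cdf(x1, *ranges[1]),
    1 - uniform_cdf(x2, *ranges[2]),
]
print(p, x1, x2, checks)
```

The three entries of `checks` are the three members of condition (9), recomputed from the cumulative distribution functions rather than from the linear shortcut.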





In the example quoted in the introduction one has


(13) 






with $\vartheta_\nu = 75, 50, 25$ and $\sigma_\nu^2 = 8, 32, 72$. If $\Phi(x)$ denotes the integral


the conditions (9) become 



The first and last expressions equated lead to $x_1 + 3x_2 = 250$. The complete solution can be found with the help of tables for $\Phi$. It is $x_1 = 39.9920$, $x_2 = 70.0027$ with the common value twice 0.961 for the three expressions (14). Hence the result as quoted in the introduction.
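The numerical solution can be reproduced without tables. The following is a minimal sketch, assuming that the three classes are normal with the stated means and variances, that the class with mean 25 receives all $x < x_1$ and the class with mean 75 all $x > x_2$, and using the standard normal distribution function from Python's `statistics` module in place of $\Phi$. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()
means = (75.0, 50.0, 25.0)
sigmas = (sqrt(8.0), sqrt(32.0), sqrt(72.0))

def success_rates(x1, x2):
    """(P1, P2, P3) for the cut points x1 < x2 under the assumed assignment."""
    p1 = 1.0 - N.cdf((x2 - means[0]) / sigmas[0])
    p2 = N.cdf((x2 - means[1]) / sigmas[1]) - N.cdf((x1 - means[1]) / sigmas[1])
    p3 = N.cdf((x1 - means[2]) / sigmas[2])
    return p1, p2, p3

def solve(lo=65.0, hi=75.0, iters=60):
    """Bisect on x2, with x1 = 250 - 3*x2 (the relation that makes P1 = P3)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p1, p2, _ = success_rates(250.0 - 3.0 * mid, mid)
        if p1 > p2:      # P1 falls and P2 rises as x2 grows
            lo = mid
        else:
            hi = mid
    x2 = 0.5 * (lo + hi)
    return 250.0 - 3.0 * x2, x2

x1, x2 = solve()
rates = success_rates(x1, x2)
print(round(x1, 4), round(x2, 4), [round(r, 4) for r in rates])
```

The common success rate returned is the probability itself, which corresponds to half of the common value of the expressions written in terms of $\Phi$.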

Let us now take up the case of six normal distributions with equidistant mean values $\vartheta = \pm a, \pm 3a, \pm 5a$ and one and the same variance $\sigma^2$. Then, because of symmetry, two equations only have to be fulfilled:

$$1 + \Phi\!\left(\frac{x_1 + 5a}{\sigma\sqrt{2}}\right) = \Phi\!\left(\frac{x_2 + 3a}{\sigma\sqrt{2}}\right) - \Phi\!\left(\frac{x_1 + 3a}{\sigma\sqrt{2}}\right) = \Phi\!\left(\frac{a}{\sigma\sqrt{2}}\right) - \Phi\!\left(\frac{x_2 + a}{\sigma\sqrt{2}}\right).$$

For $\sigma^2 = 0.32\,a^2$, the numerical solution gives

$$x_1 = -4.160a, \qquad x_2 = -2.062a.$$

The success rate, i.e. half the common value of the above expressions, is 0.931. The six intervals extend from $-\infty$ to $x_1$, from $x_1$ to $x_2$, from $x_2$ to 0, from 0 to $-x_2$, from $-x_2$ to $-x_1$, and from $-x_1$ to $\infty$.
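The six-class case can be solved by a one-dimensional search on the common success rate. This sketch takes $a = 1$, assumes that the stated value 0.32 is the ratio $\sigma^2/a^2$, and works with the standard normal distribution function instead of $\Phi$; all names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N = NormalDist()
SIGMA = sqrt(0.32)   # sigma in units of a (assumption: sigma^2 = 0.32 a^2)

def cuts_for(p):
    """Cut points x1, x2 that give the classes at -5a and -3a success rate p."""
    x1 = -5.0 + SIGMA * N.inv_cdf(p)
    x2 = -3.0 + SIGMA * N.inv_cdf(p + N.cdf((x1 + 3.0) / SIGMA))
    return x1, x2

def residual(p):
    """Success rate of the class at -a on (x2, 0), minus p."""
    _, x2 = cuts_for(p)
    return N.cdf(1.0 / SIGMA) - N.cdf((x2 + 1.0) / SIGMA) - p

def solve(lo=0.6, hi=0.96, iters=60):
    """Bisection: the residual is positive at lo and negative at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    x1, x2 = cuts_for(p)
    return p, x1, x2

p, x1, x2 = solve()
print(round(p, 4), round(x1, 4), round(x2, 4))
```

By symmetry the remaining three cut points are 0, $-x_2$ and $-x_1$.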

(b) Case of more than one dimension. Let us assume that two classes A and B have uniform distributions extending over volumes $V_1 = 1/p_1$ and $V_2 = 1/p_2$ respectively. If the two regions have a common part of volume $V$, each surface within the common space fulfills the condition $p_1/p_2 = \text{constant}$. Thus, the two regions $R_1$ and $R_2$ are not uniquely determined but subject to one condition only, which determines the optimum success rate. If $\kappa V$ is cut out from $V_1$ and $(1 - \kappa)V$ from $V_2$, the relation must be fulfilled:

$$1 - p_1 V \kappa = 1 - p_2 V (1 - \kappa), \quad \text{i.e.} \quad \kappa = \frac{p_2}{p_1 + p_2}$$

and the success rate is

$$S = 1 - p_1 V \kappa = 1 - \frac{p_1 p_2}{p_1 + p_2}\,V = 1 - p_2 V (1 - \kappa).$$


If three classes A, B, and C are considered with the densities $p_1 = 1/V_1$, $p_2 = 1/V_2$, $p_3 = 1/V_3$ and the first two regions have a space of volume $V$ in common, the latter two a space of volume $V'$, the conditions are

$$1 - p_1 V (1 - \kappa) = 1 - p_2(\kappa V + \lambda V') = 1 - p_3 (1 - \lambda) V'$$

which supply

$$1 - \kappa = \frac{p_2 p_3}{p_1 p_2 + p_2 p_3 + p_3 p_1}\cdot\frac{V + V'}{V}, \qquad 1 - \lambda = \frac{p_1 p_2}{p_1 p_2 + p_2 p_3 + p_3 p_1}\cdot\frac{V + V'}{V'}$$

and the success rate has the value

$$S = 1 - (V + V')\,\frac{p_1 p_2 p_3}{p_1 p_2 + p_2 p_3 + p_3 p_1}.$$
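A short exact check of the three-class formulas follows. The densities and overlap volumes are hypothetical values chosen only so that each overlap fits inside both neighboring classes; the function name is likewise an assumption.

```python
from fractions import Fraction

def three_uniform_classes(p1, p2, p3, v, v_prime):
    """Return (kappa, lam, success) equalizing the three success rates.

    v is the volume shared by classes 1 and 2, v_prime the volume shared by
    classes 2 and 3; p1, p2, p3 are the constant densities.
    """
    denom = p1 * p2 + p2 * p3 + p3 * p1
    err = (v + v_prime) * p1 * p2 * p3 / denom   # common error probability
    kappa = 1 - err / (p1 * v)
    lam = 1 - err / (p3 * v_prime)
    return kappa, lam, 1 - err

# Hypothetical example: V1 = 10, V2 = 20, V3 = 5, overlaps V = 2 and V' = 1.
p1, p2, p3 = Fraction(1, 10), Fraction(1, 20), Fraction(1, 5)
v, v_prime = Fraction(2), Fraction(1)
kappa, lam, success = three_uniform_classes(p1, p2, p3, v, v_prime)
print(kappa, lam, success)
```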

If the $p_\nu$ are normal density functions, say

$$p_\nu(x, y) = \frac{\sqrt{D_\nu}}{\pi}\,e^{-Q_\nu}, \qquad Q_\nu = \alpha_\nu (x - a_\nu)^2 + 2\beta_\nu (x - a_\nu)(y - b_\nu) + \gamma_\nu (y - b_\nu)^2$$

and $D_\nu$ the corresponding determinants, the curves separating the regions $R_\nu$ are the conics

$$Q_\nu - Q_\mu = \text{const.}$$

where the constants are determined by the conditions that all $P_\nu$ must be equal. If the $\alpha$, $\beta$, $\gamma$ have the same values for every $\nu$, the borders consist of straight lines. In this case one can reduce the expressions for $p_\nu$, by an affine transformation, to

$$p_\nu(x, y) = \frac{1}{\pi}\,e^{-(x - a_\nu)^2 - (y - b_\nu)^2}.$$

In the transformed plane the borderline between the regions $R_\nu$ and $R_\mu$ is perpendicular to the straight line that connects the points $A_\nu(a_\nu, b_\nu)$ and $A_\mu(a_\mu, b_\mu)$. If all points $A_\nu$ lie on the same straight line (in particular, if $n = 2$) the whole problem is practically identical with the one-dimensional one ($m = 1$). In the case $n = 3$, in general, the three regions are confined by three lines perpendicular to $A_1A_2$, $A_2A_3$, $A_3A_1$ passing through a point $C$ whose coordinates are determined by the equations $P_1 = P_2 = P_3$. If $r_\nu$ denotes the distance $A_\nu C$ and $\varphi_\nu$, $\psi_\nu$ are the angles $A_\nu C$ forms with the adjacent sides of the triangle $A_1A_2A_3$, one has to use the function

$$F(r, \varphi) = \frac{1}{2\sqrt{\pi}}\int_0^{\infty} \Phi(r - z\tan\varphi)\,e^{-z^2}\,dz.$$

Then the two conditions for $C$ read

$$F(r_1, \varphi_1) + F(r_1, \psi_1) = F(r_2, \varphi_2) + F(r_2, \psi_2) = F(r_3, \varphi_3) + F(r_3, \psi_3)$$

and the success rate equals 0.5 plus the common value of these three expressions.



ON AN EXTENSION OF THE CONCEPT OF MOMENT WITH APPLICATIONS TO MEASURES OF VARIABILITY, GENERAL SIMILARITY, AND OVERLAPPING¹

Milton da Silva Rodrigues
State University of Sao Paulo

1. Introduction. Given a frequency distribution $D$: $[X_i, F_i]$ $(i = 1, 2, 3, \ldots, n)$, we shall call the expression

$$M_r(D, X_j) = \sum_{i=1}^{n} (X_i - X_j)^r F_i$$

the $r$th total moment of $D$ about the origin $X_j$. We shall consider the weighted sum

$$\mathfrak{M}_r = \sum_{\varphi} W_j\,M_r(D, X_j)$$

where $W_j$ denotes the weight corresponding to the particular origin $X_j$, and the summation is over a field $\varphi$. In particular, if $\varphi$ is the set of all values assumed in $D$ by the variate $X_i$, and if $W_j = F_j$, we shall call the quantity $\mathfrak{M}_r$ the $r$th complete total moment of $D$. If, on the contrary, $W_j$ is the frequency $F'_j$ of the value $X'_j$ in a second frequency distribution $D'$: $[X'_j, F'_j]$ and $\varphi'$ is the set of all values assumed by the variate $X'_j$ in $D'$, $\mathfrak{M}_r$ will be called the $r$th aggregate moment of $D$ and $D'$. A modification of this procedure leads to what we shall call the moment of transvariation of $D$ and $D'$.

The consideration of complete moments draws attention to certain previously known measures of variability which are independent of the origin selected, and also provides simple methods of computation which are useful for data given in the form of a frequency distribution. The investigation of aggregate moments and moments of transvariation gives rise to certain measures of general similarity between two distributions, as well as measures of the amount of overlapping.

2. Sliding and complete moments of a frequency distribution.

2.1. We shall give the name sliding total moments of order $r$ to the successive values, for particular values of $j$, of the expression

(2.11) $$M_r(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j)^r F_i\right].$$

¹ The Portuguese original of this paper was written in Brazil, in August 1943. Its translation into English was entirely revised by Dr. T. Greville, Bureau of the Census, who proposed also many simplifications in the derivation of formulae. For his painstaking labor and interest I wish to express my very sincere appreciation. I also wish to thank Dr. W. Edwards Deming for reading the manuscript and making several valuable suggestions.



The expression for the complete total moment, written out in full, is

(2.12) $$\mathfrak{M}_r = \sum_{j=1}^{n} M_r(X_j) = \sum_{j=1}^{n}\sum_{i=1}^{n} \left[(X_i - X_j)^r F_i F_j\right].$$

It is readily seen that the complete moment is independent of the choice of origin.

2.2. If $r = 0$, we have

$$M_0(X_j) = F_j \sum_{i=1}^{n} F_i.$$

The complete total moment of order zero will therefore be

(2.21) $$\mathfrak{M}_0 = \sum_{j=1}^{n} F_j \sum_{i=1}^{n} F_i = M_0^2$$

where $M_0$ stands for the total moment of order zero about the origin of the $X$, that is,

$$M_0 = \sum F_i.$$

2.3. If $r = 1$, we shall have

$$M_1(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j) F_i\right].$$

Using $M_1$ to denote the total moment of order one about the origin of the $X$, we obtain

$$M_1(X_j) = F_j \sum_{i=1}^{n} (X_i F_i) - X_j F_j M_0.$$

Making $j$ vary from 1 to $n$ and summing, we have

(2.31) $$\mathfrak{M}_1 = \sum_{j=1}^{n} F_j M_1 - \sum_{j=1}^{n} X_j F_j M_0 = M_0 M_1 - M_1 M_0 = 0.$$

This result is due to the fact that we took the deviations $X_i - X_j$ with their proper signs. We may, however, calculate the value which the complete moment of first order would have if absolute values were used. The sliding total moment thus modified becomes

$$|M_1(X_j)| = F_j \left[\sum_{i=1}^{j-1} (X_j - X_i) F_i + \sum_{i=j+1}^{n} (X_i - X_j) F_i\right]$$

which may be put in the form

(2.32) $$|M_1(X_j)| = F_j X_j \left[\sum_{i=1}^{j-1} F_i - \sum_{i=j+1}^{n} F_i\right] - F_j \left[\sum_{i=1}^{j-1} F_i X_i - \sum_{i=j+1}^{n} F_i X_i\right].$$




Summing with respect to $j$ and employing the substitutions

(2.33) $$\sum_{i=j+1}^{n} F_i = M_0 - \sum_{i=1}^{j} F_i, \qquad \sum_{i=j+1}^{n} F_i X_i = M_1 - \sum_{i=1}^{j} F_i X_i$$

gives for the complete total moment

(2.34) $$|\mathfrak{M}_1| = 2\sum_{j=1}^{n}\left[F_j X_j \sum_{i=1}^{j-1} F_i\right] - 2\sum_{j=1}^{n}\left[F_j \sum_{i=1}^{j-1} F_i X_i\right].$$

The quotient

(2.35) $$m_1 = \frac{|\mathfrak{M}_1|}{\mathfrak{M}_0}$$

of the complete total moment of order one by the complete total moment of order zero we shall call the complete unit moment of order one, or simply the complete moment of order one, when no confusion would result.

The complete unit moment is a measure of variability, identical with that already considered by Andrae and Helmert, respectively in 1869 and in 1876, and which C. Gini, in 1912, called mean difference with repetition.²

The numerator of $m_1$ is easily computed if we observe that the upper limit $j - 1$ of the $F_i$ summation, for example, means that each product $X_j F_j$ must be multiplied by the cumulative frequency corresponding to the class immediately preceding. We only have to shift the cumulative frequencies column by one class in the proper direction; the second term is similarly dealt with.
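The shifted-cumulative-column computation can be sketched as follows. The data are hypothetical, the function names are illustrative, and the values $X_i$ are assumed to be listed in ascending order; the brute-force double sum is included only as a cross-check.

```python
from fractions import Fraction
from itertools import accumulate

def complete_abs_moment_bruteforce(xs, fs):
    """Double sum of |X_i - X_j| F_i F_j over all pairs (i, j)."""
    return sum(abs(xi - xj) * fi * fj
               for xi, fi in zip(xs, fs) for xj, fj in zip(xs, fs))

def complete_abs_moment_cumulative(xs, fs):
    """Same quantity from cumulative columns shifted down by one class."""
    cum_f = [0] + list(accumulate(fs))[:-1]                               # sum of F_i, i < j
    cum_fx = [0] + list(accumulate(f * x for f, x in zip(fs, xs)))[:-1]   # sum of F_i X_i, i < j
    first = sum(f * x * c for f, x, c in zip(fs, xs, cum_f))
    second = sum(f * c for f, c in zip(fs, cum_fx))
    return 2 * (first - second)

xs = [1, 2, 4, 7]   # hypothetical class values, ascending
fs = [2, 3, 1, 4]   # hypothetical frequencies
m0 = sum(fs)
brute = complete_abs_moment_bruteforce(xs, fs)
shifted = complete_abs_moment_cumulative(xs, fs)
mean_difference = Fraction(shifted, m0 * m0)   # complete unit moment of order one
print(brute, shifted, mean_difference)
```

The cumulative form needs a single pass over the classes, whereas the double sum grows with the square of the number of classes.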


2.4. The second order sliding total moment is

$$M_2(X_j) = F_j \sum_{i=1}^{n} \left[(X_i - X_j)^2 F_i\right] = F_j M_2 - 2F_j X_j M_1 + F_j X_j^2 M_0$$

where $M_2$ is the total moment of order two. Summing with respect to $j$ gives the complete total moment of order two

(2.41) $$\mathfrak{M}_2 = \sum_{j=1}^{n} M_2(X_j) = 2(M_2 M_0 - M_1^2).$$

The complete unit moment of order two is therefore

$$m_2 = \frac{\mathfrak{M}_2}{\mathfrak{M}_0} = 2(\nu'_2 - \nu'^2_1)$$

where $\nu'$ stands for a unit moment about the origin of the $X$, namely

$$\nu'_r = \frac{\sum X^r F}{\sum F}.$$

$m_2$ is also a measure of variability, independent of the choice of origin. It is equal to the square of Gauss's "Präzisionsmass", and to the double of Fisher's variance; like $m_1$ it was defined by Andrae and Helmert, and was called by Gini the mean square difference with repetition.

² Apud Czuber, Wahrscheinlichkeitsrechnung, Vol. 2, (1932), p. 310. C. Gini, Variabilità e Mutabilità, Cagliari, 1912.


2.5. If $r = 3$ we have for the sliding moments,

$$M_3(X_j) = F_j M_3 - 3F_j X_j M_2 + 3F_j X_j^2 M_1 - F_j X_j^3 M_0.$$

Summation over $j$ gives

(2.51) $$\mathfrak{M}_3 = \sum_{j=1}^{n} M_3(X_j) = M_0 M_3 - 3M_1 M_2 + 3M_2 M_1 - M_3 M_0 = 0,$$

a result which is easily shown to hold for any complete moment of odd order. We may calculate the value of the complete moment of order three using absolute values of the deviations $X_i - X_j$ by a process similar to that previously described for the calculation of $|\mathfrak{M}_1|$. This gives

(2.52) $$|\mathfrak{M}_3| = 2\left[\sum_{j=1}^{n} F_j X_j^3 \sum_{i=1}^{j-1} F_i - 3\sum_{j=1}^{n} F_j X_j^2 \sum_{i=1}^{j-1} F_i X_i + 3\sum_{j=1}^{n} F_j X_j \sum_{i=1}^{j-1} F_i X_i^2 - \sum_{j=1}^{n} F_j \sum_{i=1}^{j-1} F_i X_i^3\right].$$


2.6. The sliding moments of order four are

$$M_4(X_j) = F_j M_4 - 4F_j X_j M_3 + 6F_j X_j^2 M_2 - 4F_j X_j^3 M_1 + F_j X_j^4 M_0.$$

Summing with respect to $j$ and simplifying, we have

(2.61) $$\mathfrak{M}_4 = M_0 M_4 - 4M_1 M_3 + 6M_2^2 - 4M_3 M_1 + M_4 M_0 = 2(M_0 M_4 - 4M_1 M_3 + 3M_2^2).$$

Dividing both sides by $\mathfrak{M}_0$ in order to obtain the complete moment on a unit basis, we have

$$m_4 = 2\left[\frac{M_4}{M_0} - 4\,\frac{M_1}{M_0}\,\frac{M_3}{M_0} + 3\left(\frac{M_2}{M_0}\right)^2\right] = 2(\nu'_4 - 4\nu'_1\nu'_3 + 3\nu'^2_2).$$

But, if $\nu$ indicates a moment about the mean,

$$\nu_4 = \nu'_4 - 4\nu'_1\nu'_3 + 6\nu'^2_1\nu'_2 - 3\nu'^4_1.$$

By substitution, therefore,

$$m_4 = 2(\nu_4 + 3\nu'^2_2 - 6\nu'^2_1\nu'_2 + 3\nu'^4_1)$$

(2.62) $$\phantom{m_4} = 2\left[\nu_4 + 3(\nu'_2 - \nu'^2_1)^2\right] = 2(\nu_4 + 3\nu_2^2).$$

This complete moment gives rise to a measure of kurtosis independent of the choice of origin

$$k = \frac{m_4}{m_2^2} = \frac{\nu_4}{2\nu_2^2} + \frac{3}{2}.$$

In case of mesokurtosis this reduces to 3, since for the normal curve $\nu_4/\nu_2^2 = 3$; leptokurtosis and platikurtosis occur for the same ranges as in the case of Pearson's measure $\beta_2$.
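The relations between complete moments and ordinary moments about the mean can be confirmed exactly on a small table. The data below are hypothetical and the helper names are illustrative; the double sum is taken directly from the definition of the complete total moment.

```python
from fractions import Fraction

def total_moment(xs, fs, r):
    """M_r: r-th total moment about the origin of X."""
    return sum(f * x**r for x, f in zip(xs, fs))

def complete_moment(xs, fs, r):
    """r-th complete total moment: double sum of (X_i - X_j)^r F_i F_j."""
    return sum((xi - xj)**r * fi * fj
               for xi, fi in zip(xs, fs) for xj, fj in zip(xs, fs))

def central_unit_moment(xs, fs, r):
    """nu_r: r-th unit moment about the arithmetic mean."""
    m0 = total_moment(xs, fs, 0)
    mean = Fraction(total_moment(xs, fs, 1), m0)
    return sum(f * (x - mean)**r for x, f in zip(xs, fs)) / m0

xs, fs = [1, 2, 4, 7], [2, 3, 1, 4]   # hypothetical data
m0 = total_moment(xs, fs, 0)
nu2 = central_unit_moment(xs, fs, 2)
nu4 = central_unit_moment(xs, fs, 4)
m2 = Fraction(complete_moment(xs, fs, 2), m0**2)   # complete unit moment, order 2
m4 = Fraction(complete_moment(xs, fs, 4), m0**2)   # complete unit moment, order 4
k = m4 / m2**2                                     # kurtosis measure
print(m2 == 2 * nu2, m4 == 2 * (nu4 + 3 * nu2**2), k)
```

Shifting every value in `xs` by the same constant leaves `m2`, `m4` and `k` unchanged, which is the independence of origin claimed for the complete moments.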


3. Aggregate moments of two frequency distributions.

3.1. Given two frequency distributions, $D$: $[X_i, F_i]$ $(i = 1, 2, 3, \ldots, n)$ and $D'$: $[X'_j, F'_j]$ $(j = 1, 2, 3, \ldots, p)$, and a fixed point $X'_j$ belonging to the second distribution, we shall call the expression

(3.11) $$M_r(D, X'_j) = F'_j \sum_{i=1}^{n} (X_i - X'_j)^r F_i$$

the $r$th aggregate sliding total moment of the first distribution about the element $X'_j$ of the second. Summation over $j$ gives

(3.12) $${}^{a}\mathfrak{M}_r = \sum_{j=1}^{p}\sum_{i=1}^{n} F'_j (X_i - X'_j)^r F_i.$$

We shall call ${}^{a}\mathfrak{M}_r$ the aggregate complete total moment or, simply, the aggregate total moment of $D$ about $D'$. It is clear that this is a symmetric function of the two distributions, except for a change of sign in the case of odd moments.

3.2. If $r = 0$, we have

(3.21) $$M_0(D, X'_j) = F'_j \sum_{i=1}^{n} F_i$$

(3.22) $${}^{a}\mathfrak{M}_0 = \sum_{j=1}^{p} F'_j \sum_{i=1}^{n} F_i = M_0 M'_0.$$

3.3. If $r = 1$, we have

(3.31) $$M_1(D, X'_j) = F'_j M_1 - F'_j X'_j M_0$$

(3.32) $${}^{a}\mathfrak{M}_1 = M_1 M'_0 - M_0 M'_1.$$

We shall call the quotient

(3.33) $${}^{a}m_r = \frac{{}^{a}\mathfrak{M}_r}{{}^{a}\mathfrak{M}_0}$$

the aggregate unit moment of order $r$ (or the aggregate moment coefficient), or simply the aggregate moment of order $r$ whenever the simpler name will not cause confusion.

It is obvious that the aggregate moments are measures of general similarity, as to form and position, between $D$ and $D'$. This similarity will be an identity in case the two distributions coincide perfectly; on the other hand, it is clear that there is no limit to the degree of non-similarity which may be encountered. We shall take unity to represent the maximum and zero the minimum of similarity, and thus define a provisional similarity index

(3.34) $$S = \frac{m_1 m'_1}{{}^{a}m_1^{\,2}}.$$

But

$${}^{a}m_1 = \frac{M_1 M'_0 - M_0 M'_1}{M_0 M'_0} = A - A'$$

where $A$ and $A'$ stand for the arithmetic means of $D$ and $D'$, respectively. Now it will be seen that if $A = A'$, $S = \infty$. This result is due to the fact that in the calculation of $m_1$ and $m'_1$ we took the absolute values of the deviations $X_i - X_j$, while in the calculation of ${}^{a}m_1$ we retained the algebraic signs. In order to make the two terms of the fraction in (3.34) comparable, we can either: 1) calculate ${}^{a}m_1$ also using absolute values; or 2) take only the positive or only the negative part of both numerator and denominator of $S$. In any case, $A = A'$ is a necessary condition for the maximum of $S$.


3.4. We shall employ the first method suggested above, although we shall return to the second in the third part of the paper. As long as $D$ and $D'$ do not overlap, all the $X_i - X'_j$ deviations have the same sign and this is the same as that of the difference $A - A'$. If, however, there is some overlapping this will not be the case, some deviations having different signs from that of $A - A'$. This brings us to Gini's concept of "Transvariation". He applies this term to any deviation $X_i - X'_j$ which does not have the same sign as $\bar{X} - \bar{X}'$, these symbols denoting averages of any previously specified type; and he calls the magnitude of the deviation its "intensity".

In computing the complete moment of the first order using absolute values, in order to simplify the algebra we shall assume the same origin for $X$ and $X'$ and therefore drop the stroke from the $X$, but not of course from the $F$. If certain values of $X$ occur in one distribution and not in the other, we can merely consider the frequency as zero in the second distribution. In this way the two distributions can be regarded as extending over the same total range. If $X_1$ and $X_m$ denote the extreme values, the sliding total moment is

$$|M_1(D, X_j)| = F'_j \left[\sum_{i=1}^{j-1} (X_j - X_i) F_i + \sum_{i=j+1}^{m} (X_i - X_j) F_i\right] = F'_j X_j \left(\sum_{i=1}^{j-1} F_i - \sum_{i=j+1}^{m} F_i\right) - F'_j \left(\sum_{i=1}^{j-1} F_i X_i - \sum_{i=j+1}^{m} F_i X_i\right).$$

Summing with respect to $j$ and at the same time employing the substitutions (2.33) or their transposed form, we obtain the following alternative expressions for the complete aggregate moment:

(3.41) $$|{}^{a}\mathfrak{M}_1| = M_1 M'_0 - M_0 M'_1 + 2\sum_{j=1}^{m}\left[F'_j X_j \sum_{i=1}^{j-1} F_i\right] - 2\sum_{j=1}^{m}\left[F'_j \sum_{i=1}^{j-1} F_i X_i\right]$$

(3.42) $$|{}^{a}\mathfrak{M}_1| = M_0 M'_1 - M_1 M'_0 - 2\sum_{j=1}^{m}\left[F'_j X_j \sum_{i=j+1}^{m} F_i\right] + 2\sum_{j=1}^{m}\left[F'_j \sum_{i=j+1}^{m} F_i X_i\right].$$

Note the similarity of the first of these forms to formula (2.34), which is in fact a particular case of formula (3.41). Alternatively, we may obtain from formula (3.42) the particular case

(2.34a) $$|\mathfrak{M}_1| = 2\sum_{j=1}^{n}\left(F_j \sum_{i=j+1}^{n} F_i X_i\right) - 2\sum_{j=1}^{n}\left(F_j X_j \sum_{i=j+1}^{n} F_i\right)$$

which is equivalent to (2.34).
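The two alternative expressions (3.41) and (3.42) can be compared numerically with the defining double sum. In the sketch below the expressions are written as they follow from the sliding moment with absolute values and the substitutions (2.33); the common grid, the frequencies and the function names are hypothetical, and the grid values are assumed ascending.

```python
from itertools import accumulate

def aggregate_abs_moment_bruteforce(xs, fs, fps):
    """Double sum of |X_i - X_j| F_i F'_j on a common grid xs."""
    return sum(abs(xi - xj) * fi * fpj
               for xi, fi in zip(xs, fs) for xj, fpj in zip(xs, fps))

def _totals(xs, fs):
    return sum(fs), sum(f * x for f, x in zip(fs, xs))

def aggregate_abs_moment_lower(xs, fs, fps):
    """Form with inner sums over i < j, as in (3.41)."""
    m0, m1 = _totals(xs, fs)
    mp0, mp1 = _totals(xs, fps)
    low_f = [0] + list(accumulate(fs))[:-1]
    low_fx = [0] + list(accumulate(f * x for f, x in zip(fs, xs)))[:-1]
    return (m1 * mp0 - m0 * mp1
            + 2 * sum(fp * x * c for fp, x, c in zip(fps, xs, low_f))
            - 2 * sum(fp * c for fp, c in zip(fps, low_fx)))

def aggregate_abs_moment_upper(xs, fs, fps):
    """Form with inner sums over i > j, as in (3.42)."""
    m0, m1 = _totals(xs, fs)
    mp0, mp1 = _totals(xs, fps)
    up_f = [m0 - c for c in accumulate(fs)]
    up_fx = [m1 - c for c in accumulate(f * x for f, x in zip(fs, xs))]
    return (m0 * mp1 - m1 * mp0
            - 2 * sum(fp * x * c for fp, x, c in zip(fps, xs, up_f))
            + 2 * sum(fp * c for fp, c in zip(fps, up_fx)))

xs = [1, 2, 3, 4, 5]
fs, fps = [2, 3, 1, 2, 0], [0, 1, 2, 1, 3]   # hypothetical overlapping tables
print(aggregate_abs_moment_bruteforce(xs, fs, fps),
      aggregate_abs_moment_lower(xs, fs, fps),
      aggregate_abs_moment_upper(xs, fs, fps))
```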

If the two distributions do not overlap, $|{}^{a}\mathfrak{M}_1|$ does not differ numerically from ${}^{a}\mathfrak{M}_1$. Let us consider the case in which there is actual overlapping, the range of non-zero frequencies extending from $X_1$ to $X_{n+p}$ for $D$ and from $X_{n+1}$ to $X_m$ for $D'$. Then formula (3.42) becomes, upon merely dropping all vanishing terms,

(3.43) $$|{}^{a}\mathfrak{M}_1| = M_0 M'_1 - M_1 M'_0 - 2\sum_{j=n+1}^{n+p}\left[F'_j X_j \sum_{i=j+1}^{n+p} F_i\right] + 2\sum_{j=n+1}^{n+p}\left[F'_j \sum_{i=j+1}^{n+p} F_i X_i\right].$$

On the other hand, formula (3.41) reduces, under the same circumstances, to a much less simple expression, which upon making the substitutions (2.33) and simplifying reduces to

(3.44) $$|{}^{a}\mathfrak{M}_1| = M_0 M'_1 - M_1 M'_0 + 2\sum_{j=n+1}^{n+p}\left[F'_j X_j \sum_{i=n+1}^{j-1} F_i\right] - 2\sum_{j=n+1}^{n+p}\left[F'_j \sum_{i=n+1}^{j-1} F_i X_i\right] - 2\sum_{j=n+1}^{n+p} F'_j X_j \sum_{i=n+1}^{n+p} F_i + 2\sum_{j=n+1}^{n+p} F'_j \sum_{i=n+1}^{n+p} F_i X_i.$$

This result may be arrived at somewhat more easily by merely making the substitutions (2.33) directly in formula (3.43). It may be noted that formula (3.44) at once reduces to the form (2.34) if the two distributions are identical, since the additional terms all cancel. It is, however, a less satisfactory result than formula (3.43) because of the larger number of terms it contains. In order to obtain a formula which resembles (2.34) more closely, we may reverse the order of summation in formula (3.43). Observing that the terms for $j = i$ collectively vanish, we see that

(3.45) $$|{}^{a}\mathfrak{M}_1| = M_0 M'_1 - M_1 M'_0 - 2\sum_{i=n+1}^{n+p}\left[F_i \sum_{j=n+1}^{i-1} F'_j X_j\right] + 2\sum_{i=n+1}^{n+p}\left[F_i X_i \sum_{j=n+1}^{i-1} F'_j\right].$$

It will be seen that the simple method of numerical computation described in section 2.3 is immediately applicable to all the formulas (3.41) to (3.45). Dividing any of these expressions by ${}^{a}\mathfrak{M}_0$ gives $|{}^{a}m_1|$. For example, if formula (3.43) is used, we have

(3.46) $$|{}^{a}m_1| = A' - A - \frac{2}{M_0 M'_0}\left\{\sum_{j=n+1}^{n+p}\left[F'_j X_j \sum_{i=j+1}^{n+p} F_i\right] - \sum_{j=n+1}^{n+p}\left[F'_j \sum_{i=j+1}^{n+p} F_i X_i\right]\right\}.$$

Substituting this value in equation (3.34), we have

(3.47) $$S_1 = \frac{m_1 m'_1}{|{}^{a}m_1|^2},$$

a quantity which we shall call the "mean coefficient of similarity."

We now observe that $S_1$ is a general measure of similarity whose magnitude is affected by differences in either form or position. It may, however, be desirable to eliminate the position element, in order to isolate the form aspect. To do this it will suffice to relate the value which $|{}^{a}m_1|$ would have for $A = A'$ to the product $m_1 m'_1$. This value of $|{}^{a}m_1|$ is, in fact, its minimum; denoting it by ${}^{a}\mu_1$ we obtain the index

(3.48) $$\mathfrak{S}_1 = \frac{m_1 m'_1}{{}^{a}\mu_1^{\,2}}$$

which we shall call the mean similarity ratio.

It is clear that all the above mentioned indices measure overlapping as well as similarity. Overlapping between two distributions will be greatest when their similarity is greatest, or when $|{}^{a}m_1|$ is a minimum. In order to bring out more clearly the overlapping aspect we may follow Gini's procedure of contrasting the actual value of a measure with its maximum value. As already pointed out, if the form of the two distributions is held constant, but their relative position is varied, the degree of overlapping, as measured by the mean similarity ratio, is greatest when the arithmetic means coincide. This method of procedure is embodied in the index

(3.49) $$\mathfrak{T}_1 = \frac{{}^{a}\mu_1}{|{}^{a}m_1|}$$

which we shall call the "intensity of transvariation or overlapping." To calculate ${}^{a}\mu_1$ we may, for example, merely add the difference $A' - A = c$ to the $X$ values, in order to move $D$ along the $X$-axis a distance of $c$, and then proceed to calculate $|{}^{a}m_1|$ in the usual manner from the adjusted $X$ values.


3.5. If, in (3.11), $r = 2$, we have

$$M_2(D, X'_j) = F'_j \sum_{i=1}^{n} (X_i - X'_j)^2 F_i = F'_j M_2 - 2X'_j F'_j M_1 + X'^2_j F'_j M_0.$$

Summing for $j$ then gives

(3.51) $${}^{a}\mathfrak{M}_2 = M'_0 M_2 - 2M'_1 M_1 + M'_2 M_0.$$

If we define the second aggregate unit moment as

$${}^{a}m_2 = \frac{{}^{a}\mathfrak{M}_2}{M_0 M'_0}$$

then

(3.52) $${}^{a}m_2 = \sigma^2 + \sigma'^2 + (A - A')^2,$$

where the $\sigma$ and the $A$ stand for the standard deviations and the arithmetic means of the respective distributions. Now we define the "mean square coefficient of similarity" as the value of

(3.53) $$S_2 = \frac{m_2 m'_2}{{}^{a}m_2^{\,2}} = \frac{4\sigma^2\sigma'^2}{\left[\sigma^2 + \sigma'^2 + (A - A')^2\right]^2}.$$

It is obvious that a maximum value of $S_2$ requires that $A = A'$ as a necessary condition for the maximum degree of overlapping. Maximum similarity requires, in addition, $\sigma = \sigma'$, in which case $S_2 = 1$.

For a measure of similarity which is independent of difference in position between the two distributions, we define

(3.54) $$\mathfrak{S}_2 = \frac{m_2 m'_2}{{}^{a}\mu_2^{\,2}}$$

where ${}^{a}\mu_2$ is the minimum value of ${}^{a}m_2$ for all positions of the two distributions, without changing their form. This is obtained by merely taking

(3.55) $${}^{a}\mu_2 = \sigma^2 + \sigma'^2.$$

For a measure of overlapping we can follow Gini in contrasting the actual value of ${}^{a}m_2$ with its minimum ${}^{a}\mu_2$, since the maximum of overlapping corresponds to the minimum value of ${}^{a}m_2$. We thus set

$$\mathfrak{T}_2 = \frac{{}^{a}\mu_2}{{}^{a}m_2} = \frac{\sigma^2 + \sigma'^2}{\sigma^2 + \sigma'^2 + (A - A')^2},$$

a measure which we shall call the "density of overlapping". Its maximum value is unity.
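The second-order indices depend only on the two means and variances, which the following sketch confirms against the defining double sum. The two tables are hypothetical, the names are illustrative, and the index formulas used are the ones implied by relation (3.52).

```python
from fractions import Fraction

def mean_and_variance(xs, fs):
    """Arithmetic mean A and variance sigma^2 of a frequency table."""
    n = sum(fs)
    a = Fraction(sum(f * x for x, f in zip(xs, fs)), n)
    var = Fraction(sum(f * x * x for x, f in zip(xs, fs)), n) - a * a
    return a, var

def aggregate_unit_moment_2(xs, fs, xps, fps):
    """Second aggregate unit moment from the double sum of (X_i - X'_j)^2."""
    total = sum((x - xp)**2 * f * fp
                for x, f in zip(xs, fs) for xp, fp in zip(xps, fps))
    return Fraction(total, sum(fs) * sum(fps))

xs, fs = [1, 2, 3], [1, 2, 1]       # hypothetical D
xps, fps = [2, 4, 6], [1, 1, 2]     # hypothetical D'
a, var = mean_and_variance(xs, fs)
ap, varp = mean_and_variance(xps, fps)
am2 = aggregate_unit_moment_2(xs, fs, xps, fps)
s2 = 4 * var * varp / am2**2               # mean square coefficient of similarity
overlap_density = (var + varp) / am2       # density of overlapping
print(am2, s2, overlap_density)
```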

It may be remarked that all the indices proposed in this paragraph are easier to calculate than those of paragraph 3.4. The individual terms are all functions of only one of the two distributions; yet the resulting indices are independent of the origin chosen, and therefore free from any criticism based on doubt as to the representativeness of the arithmetic mean, in cases of marked skewness.


4. Positive and negative moments, and moments of transvariation.

4.1. The aggregate sliding total moment of two frequency distributions $D$ and $D'$ may be expressed in the form

(4.11) $$M_r(D, X_j) = F'_j \sum_{i=1}^{j-1} (X_i - X_j)^r F_i + F'_j \sum_{i=j+1}^{m} (X_i - X_j)^r F_i$$

when both distributions have been artificially extended, if necessary, to cover the same total range, as previously described in section 3.4. We shall characterize the second term in the right member of (4.11) as the positive sliding moment, and the absolute value of the first term as the negative sliding moment. We shall denote these moments by ${}^{+}M_r(D, X_j)$ and ${}^{-}M_r(D, X_j)$. The complete moments obtained by summing these separate terms over the range of values of $j$ we shall call the positive and negative aggregate complete moments. Thus the positive complete moment is

(4.12) $${}^{+a}\mathfrak{M}_r = \sum_{j=1}^{m}\left[F'_j \sum_{i=j+1}^{m} (X_i - X_j)^r F_i\right]$$

and the negative complete moment is

(4.13) $${}^{-a}\mathfrak{M}_r = \sum_{j=1}^{m}\left[F'_j \sum_{i=1}^{j-1} |X_i - X_j|^r F_i\right].$$

That one of these two partial moments which is obtained from differences $X_i - X'_j$ having the opposite sense to that of the difference $\bar{X} - \bar{X}'$ will be called the moment of transvariation of the two distributions and will be denoted by the symbol ${}^{t}\mathfrak{M}_r$. Here, as in section 3.4, $\bar{X}$ and $\bar{X}'$ denote averages of any previously selected type. For example, if the arithmetic means are the averages selected, and if $A - A'$ is positive, then the negative aggregate moment is the moment of transvariation, and vice-versa.

In the trivial case in which the two distributions are identical, the positive and negative complete moments are equal, and both reduce to merely one half the aggregate complete moment (computed by the use of absolute values in the case of moments of odd order).
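A small sketch of the split into positive and negative complete moments follows. It assumes a common ascending grid for both distributions and takes the arithmetic means as the averages that decide which part is the moment of transvariation; the tables and function names are hypothetical.

```python
from fractions import Fraction

def partial_moments(xs, fs, fps, r):
    """Positive and negative aggregate complete moments of order r.

    xs is the common ascending grid; fs and fps are the frequencies of D and
    D' on that grid (zero where a value does not occur).
    """
    pos = sum(fp * fi * (xi - xj)**r
              for xj, fp in zip(xs, fps) for xi, fi in zip(xs, fs) if xi > xj)
    neg = sum(fp * fi * (xj - xi)**r
              for xj, fp in zip(xs, fps) for xi, fi in zip(xs, fs) if xi < xj)
    return pos, neg

def transvariation_moment(xs, fs, fps, r):
    """The partial moment whose differences oppose the sign of A - A'."""
    a = Fraction(sum(f * x for x, f in zip(xs, fs)), sum(fs))
    a_prime = Fraction(sum(f * x for x, f in zip(xs, fps)), sum(fps))
    pos, neg = partial_moments(xs, fs, fps, r)
    return neg if a > a_prime else pos   # the case A = A' is left undecided here

xs = [1, 2, 3, 4]
same = [1, 2, 2, 1]
print(partial_moments(xs, same, same, 1))
print(transvariation_moment(xs, [2, 3, 0, 0], [0, 0, 1, 4], 1))   # no overlap
print(transvariation_moment(xs, [1, 2, 1, 0], [0, 1, 2, 1], 1))   # some overlap
```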

The unit moment of transvariation will be defined as 


( 4 . 14 ) 



4.2. It is evident that the moments of transvariation can be considered as measures of overlapping. Any such moment equals zero when there is no overlapping and becomes greatest when the two distributions coincide. Taking unity to represent the maximum and zero the minimum of overlapping, we may choose as a general measure of overlapping,


(4.21) 



Wr||mJ 



It will be seen that this quantity always equals zero when there is no overlapping, and equals unity when there is complete overlapping; that is, when the two distributions are identical.


5. Need for further developments. All of the measures above described were defined for the case of finite sets of magnitudes, expressed as frequency distributions $D$ and $D'$. Now these sets of magnitudes may be thought of as samples drawn out of their corresponding universes. The consideration of these universes would lead to more general representations under the form of frequency functions, and the above measures would be expressed as definite integrals rather than summations. This draws attention to the need for tests of significance of the magnitude of all the above measures, especially those of overlapping, in order to allow for sampling fluctuation. Obviously, when the frequency functions are of the asymptotic type some amount of overlapping will always exist.



ON A PROBLEM OF ESTIMATION OCCURRING IN PUBLIC OPINION POLLS

By Henry B. Mann
Ohio State University

To arrive at an estimate of the number of electoral votes that will be cast for a presidential candidate a poll is taken of $\lambda_i N$ interviews in the $i$th state $(i = 1, \ldots, 48)$, where the $\lambda_i$ are fixed constants $> 0$ such that $\sum \lambda_i = 1$, and the respondent is asked for which candidate he intends to cast his vote. To estimate the number of electoral votes which candidate A will receive, the electoral votes of all the states in which the poll shows a majority for candidate A are added and their sum is used as an estimate for the number of electoral votes which candidate A will receive. In this paper certain properties of this estimate will be discussed. It will be shown that it is a biased but consistent estimate and an upper bound for the bias will be derived. Finally we shall derive that distribution of interviews which minimizes the variance of our estimate.

In all that follows we shall consider the poll as a random or stratified random sample and shall disregard the bias introduced by inaccurate answers. Our results however remain valid as long as the sampling variance is proportional to $1/N_i$.

We shall use the following notation:

$\pi_i$ = proportion of voters in the $i$th state who intend to vote for candidate A,

$\epsilon_i = 1$ if $\pi_i > \tfrac{1}{2}$, $\epsilon_i = 0$ if $\pi_i < \tfrac{1}{2}$,

$w_i$ = number of electoral votes of the $i$th state,

$p_i$, $e_i$ = sample values of $\pi_i$ and $\epsilon_i$ resp.

We shall further exclude the case $\pi_i = \tfrac{1}{2}$.

The number of electoral votes for candidate A is then given by

$$T = \sum_{i=1}^{48} \epsilon_i w_i.$$

As an estimate of $T$ we use the quantity

(1) $$G = \sum_{i=1}^{48} e_i w_i.$$

Let $P_i$ be the probability that $p_i > \tfrac{1}{2}$ and hence $e_i = 1$. Let $\lambda_i N = N_i$ be the number of interviews in the $i$th state. If $N_i$ is not too small then $P_i$ is given by

(2) $$P_i = \frac{1}{\sqrt{2\pi}\,\sigma_i}\int_{1/2}^{\infty} e^{-(x - \pi_i)^2/(2\sigma_i^2)}\,dx.$$

In this formula $\sigma_i = \sqrt{\pi_i(1 - \pi_i)/N_i}$ if the sample is an unstratified random sample and may be somewhat less if the sample is a stratified random sample.¹ For our purposes it is sufficient to assume that $\sigma_i$ is proportional to $1/\sqrt{N_i}$.

We then have $E(e_i) = P_i$ and

(3) $$E(G) = E\Bigl(\sum_{i=1}^{48} e_i w_i\Bigr) = \sum_{i=1}^{48} P_i w_i.$$

Hence $G$ is a biased estimate of $T$. On the other hand² $\operatorname{plim} p_i = \pi_i$ and hence $\operatorname{plim} e_i = \epsilon_i$ and therefore $\operatorname{plim} G = T$. That is to say $G$ is a consistent estimate of $T$.

According to (3) the bias is given by

(4) $$B(N) = \sum_{i=1}^{48} \epsilon_i w_i - \sum_{i=1}^{48} P_i w_i = \sum_{i=1}^{48} w_i(\epsilon_i - P_i).$$


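We have

$$P_i = \frac{1}{\sqrt{2\pi}}\int_{(1/2 - \pi_i)/\sigma_i}^{\infty} e^{-x^2/2}\,dx \quad \text{if } \pi_i < \tfrac{1}{2}, \qquad 1 - P_i = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{(1/2 - \pi_i)/\sigma_i} e^{-x^2/2}\,dx \quad \text{if } \pi_i > \tfrac{1}{2}.$$

For a stratified as well as for an unstratified sample $\sigma_i$ is proportional to $1/\sqrt{N_i}$ and we therefore put

(5) $$\frac{\tfrac{1}{2} - \pi_i}{\sigma_i} = \begin{cases} \gamma_i\sqrt{N_i} & \text{if } \pi_i < \tfrac{1}{2} \\ -\gamma_i\sqrt{N_i} & \text{if } \pi_i > \tfrac{1}{2}. \end{cases}$$

Then we have in both cases

(6) $$|\epsilon_i - P_i| = \frac{1}{\sqrt{2\pi}}\int_{\gamma_i\sqrt{N_i}}^{\infty} e^{-x^2/2}\,dx.$$

We have for $a > 0$

$$\int_a^{\infty} e^{-x^2/2}\,dx < h\sum_{\nu=0}^{\infty} e^{-(a + \nu h)^2/2} < h\,e^{-a^2/2}\left(1 + e^{-ah} + e^{-2ah} + \cdots\right) = \frac{h\,e^{-a^2/2}}{1 - e^{-ah}}$$

for every value $h > 0$. Since

$$\lim_{h \to 0} \frac{h}{1 - e^{-ah}} = \frac{1}{a}$$

we have

(7) $$\int_a^{\infty} e^{-x^2/2}\,dx < \frac{e^{-a^2/2}}{a} \quad \text{for every } a > 0.$$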


¹ The variance in public opinion polls is somewhat larger than the random sampling variance due to the fact that a cluster sample is used and not a random sample. For the same reason the estimate $p_i$ of $\pi_i$ may be biased.

² For the notation used here see: H. B. Mann and A. Wald, "On stochastic limit and order relationships", Annals of Math. Stat., (1943), pp. 217-227.





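From (6) and (7) we obtain

(8) $$|\epsilon_i - P_i| < \frac{e^{-\gamma_i^2 N_i/2}}{\sqrt{2\pi N_i}\,\gamma_i}.$$

From (4) and (8) we have

(9) $$|B(N)| < \sum_{i=1}^{48} \frac{w_i\,e^{-\gamma_i^2 N_i/2}}{\sqrt{2\pi N_i}\,\gamma_i}.$$

Formula (9) is valid whenever $\pi_i \neq \tfrac{1}{2}$ and shows that $B(N)$ converges rapidly to 0 for all such values of $\pi_i$.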
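The size of the bias and of its upper bound can be illustrated numerically. The sketch below assumes an unstratified random sample, so that $\sigma_i^2 = \pi_i(1 - \pi_i)/N_i$, and uses the normal approximation for the probability of a sample majority. The four "states" are hypothetical, as are the function names.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

N = NormalDist()

# Hypothetical states: (true proportion pi_i, electoral votes w_i, interviews N_i).
states = [(0.55, 10, 400), (0.48, 25, 900), (0.60, 5, 100), (0.51, 47, 2500)]

def prob_sample_majority(pi_i, n_i):
    """Normal approximation to the probability that the sample proportion exceeds 1/2."""
    sigma = sqrt(pi_i * (1.0 - pi_i) / n_i)
    return 1.0 - N.cdf((0.5 - pi_i) / sigma)

def bias_and_bound(states):
    """Bias B(N) as in (4) and the upper bound on |B(N)| as in (9)."""
    bias = 0.0
    bound = 0.0
    for pi_i, w, n in states:
        eps = 1.0 if pi_i > 0.5 else 0.0
        # gamma_i * sqrt(N_i) = |1/2 - pi_i| / sigma_i for an unstratified sample
        gamma = abs(0.5 - pi_i) / sqrt(pi_i * (1.0 - pi_i))
        bias += w * (eps - prob_sample_majority(pi_i, n))
        bound += w * exp(-0.5 * gamma * gamma * n) / (sqrt(2.0 * pi * n) * gamma)
    return bias, bound

bias, bound = bias_and_bound(states)
print(round(bias, 4), round(bound, 4))
```

States with a proportion close to one half dominate both quantities, since the exponential factor in the bound decays slowly when $\gamma_i$ is small.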

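To obtain an approximate idea of the magnitude of the bias we may in (4) replace $\epsilon_i$ and $P_i$ by their sample values $e_i$ and $r_i$. The quantity $\sum w_i(e_i - r_i)$ can, however, not be regarded as an estimate of $B(N)$.

We now proceed to compute the standard error of $G$. We may consider the poll as 48 single experiments where the probability of success in the $i$th experiment is given by $P_i$, where

$$\frac{1}{\sqrt{2\pi}}\int_{\gamma_i\sqrt{N_i}}^{\infty} e^{-x^2/2}\,dx = \begin{cases} P_i & \text{if } \pi_i < \tfrac{1}{2} \\ 1 - P_i & \text{if } \pi_i > \tfrac{1}{2}. \end{cases}$$

Hence the variance of $G$ is given by

(10) $$\sigma^2 = \sum_{i=1}^{48} P_i(1 - P_i)\,w_i^2.$$

As an estimate of $\sigma^2$ we can use the quantity $s^2$ obtained by replacing $P_i$ by its sample value.

We shall consider that distribution of interviews as best which minimizes

$$E[(G - T)^2].$$

We have

$$E[(G - T)^2] = \sigma^2 + B^2(N).$$

We therefore consider the problem of minimizing $\sigma^2 + B^2(N)$ under the restriction $\sum N_i = N$.

We have

$$\frac{\partial \sigma^2}{\partial N_i} = \frac{\partial \sigma^2}{\partial P_i}\,\frac{\partial P_i}{\partial N_i} = w_i^2(1 - 2P_i)\,\frac{\partial P_i}{\partial N_i},$$

$$\frac{\partial B^2(N)}{\partial N_i} = 2B(N)\,\frac{\partial B(N)}{\partial N_i} = -2w_i B(N)\,\frac{\partial P_i}{\partial N_i},$$

$$\frac{\partial P_i}{\partial N_i} = \mp\frac{1}{\sqrt{2\pi}}\,e^{-\gamma_i^2 N_i/2}\,\frac{\gamma_i}{2\sqrt{N_i}}, \qquad \text{with } - \text{ if } \pi_i < \tfrac{1}{2} \text{ and } + \text{ if } \pi_i > \tfrac{1}{2}.$$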





Hence applying the method of Lagrange multipliers, we obtain

(11) $$w_i\,\frac{\partial P_i}{\partial N_i}\left[w_i(1 - 2P_i) - 2B(N)\right] = \lambda \quad (i = 1, \ldots, 48), \qquad \sum_{i=1}^{48} N_i = N.$$

The parameters $\gamma_i$ and $\pi_i$ in equation (11) can be estimated from a previous poll.³ It is not certain that (11) always has solutions. However, if the quantity $\sigma^2 + B^2(N)$ has a minimum for a set of values $N_1, \ldots, N_{48}$ with $N_i \neq 0$ $(i = 1, \ldots, 48)$ then (11) must have a solution.

One might be induced to try to estimate $\sum P_i w_i$ directly by using

$$r_i = \frac{1}{\sqrt{2\pi}}\int_{(1/2 - p_i)/s_i}^{\infty} e^{-x^2/2}\,dx$$

as an estimate of $P_i$. It is easy to see that $r_i$ is a consistent estimate of $\epsilon_i$. It will be shown however that this estimate is more biased than the estimate (1).

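Since $\sigma_i$ differs only very little from its sample estimate $s_i$ we may replace this sample estimate by $\sigma_i$. We then have

$$E(r_i) = \frac{1}{2\pi\sigma_i^2}\int_{-\infty}^{\infty}\int_{1/2}^{\infty} e^{-[(x - p_i)^2 + (p_i - \pi_i)^2]/(2\sigma_i^2)}\,dx\,dp_i.$$

Now

$$(x - p_i)^2 + (p_i - \pi_i)^2 = \frac{(x - \pi_i)^2}{2} + 2\left(p_i - \frac{x + \pi_i}{2}\right)^2.$$

Hence

$$E(r_i) = \frac{1}{2\pi\sigma_i^2}\int_{1/2}^{\infty} e^{-(x - \pi_i)^2/(4\sigma_i^2)}\,dx \int_{-\infty}^{\infty} e^{-\left(p_i - (x + \pi_i)/2\right)^2/\sigma_i^2}\,dp_i.$$

The second integral is equal to $\sqrt{\pi}\,\sigma_i$. Hence

(12) $$E(r_i) = \frac{1}{2\sqrt{\pi}\,\sigma_i}\int_{1/2}^{\infty} e^{-(x - \pi_i)^2/(4\sigma_i^2)}\,dx = \frac{1}{\sqrt{2\pi}}\int_{(1/2 - \pi_i)/(\sigma_i\sqrt{2})}^{\infty} e^{-x^2/2}\,dx.$$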


³ If $\pi_i$ for any $i$ were very close to 1 then it would be of little use to poll the $i$th state. Hence, in this case formula (11) gives a small value for $N_i$. However, the $\pi_i$ are never accurately known. The following procedure might be recommended for determining the
best distribution of interviews: If for one particular t the sample value of r< a« estimated 
from a previous poll is too close to J determine, using the Ni of the previous poll, that value 
VI of T< for which the probability is that pi is larger than J and substitute in (11) vf 
for T|. In all other cases substitute the sample value. 

If several polls are taken it is advisable to use all of them but the last one to estimate
as closely as possible the values of the $\pi_i$. The sample of the last poll before the election
should be distributed according to (11).





From (12) we see that $E(r_i) < p_i$ if $\pi_i > \tfrac12$ and $E(r_i) > p_i$ if $\pi_i < \tfrac12$.

Thus in every case this estimate is more biased than the estimate (1).

On the other hand, we shall now show that $E[(\epsilon_i - r_i)^2]$ is always smaller than
$E[(\epsilon_i - e_i)^2]$. Since $\epsilon_i = 1$ if $\pi_i > \tfrac12$ and $= 0$ if $\pi_i < \tfrac12$, it is easy to verify that
$E[(\epsilon_i - r_i)^2]$ has the same value for $\pi_i = a$ as for $\pi_i = 1 - a$, and the same is true
for $E[(\epsilon_i - e_i)^2]$. We may, therefore, without loss of generality assume that
$\pi_i < \tfrac12$.

Thus we have to show that

(13) $E(r_i^2) < E(e_i^2) = p_i = \dfrac{1}{\sqrt{2\pi}} \int_{(\frac12 - \pi_i)/\sigma_i}^{\infty} e^{-x^2/2}\,dx$ if $\pi_i < \tfrac12$.

We have 






VStT / 

1 1*00 ^ao -I 


Now 


Q(*, V, Pi) = (® — Pxf + (y — p>f + (p< — T,)’ 

Putting 




Vi = 


y' = 


<Ti ’ \/6 

1 (x - y) 1 — 27r v 
<r< ’ ■y/da'i 


(x + y - 2iri) 


ITi 


= a, 


we obtain 




2t Ja Jvic-*) 

Now for $\pi_i = \tfrac12$ we have $a = 0$, and for $\pi_i < \tfrac12$ we have $a > 0$. For $a = 0$ we
obviously have $E(r_i^2) < E(e_i^2)$. Further $\lim_{a \to \infty} E(r_i^2) = \lim_{a \to \infty} E(e_i^2) = 0$; hence (13)
is proved if we can show that


Fia) = E(r^i) - E(e^,) = ^ £ 


-j*> fV»cx-) _ 


J 




e dy dx — 


1 

\/2t 


/ "O 

Vl“ 





is a monotonically increasing function of $a$. Differentiating $F(a)$ with respect
to $a$ we obtain

dF(a) _ Vo r -K4i»-6ar+j«>) i V3 g-WOo’ 

da TT ia 2V T 

^41 = g-(WO«‘ r g-4(-[3/0«)‘ ^ ^ vl g-wo«» 

^ ^ T J# 2Vir 

ft-''’* 

J# 2Vw 


Vo 

27r 


Hence for a > 0 we have 


^ e"**’ + ^ > 0. 

da 2V T 


Hence we have proved 


A/l(i-lal) 


it <£((<•■-£,)’), 


(16) 


Since 


a = 


V}(|a|-#) 

1 -In 

V6p-|' 


- e/f] - £[(*< - 4*] 

is largest when iri = ^ we also have 
B[(*. - r.)1 >\u-Pi\- 


1 _ 

"1 

1 



.2 

2 ir • 

*0 , 


or 

(16) 1 €< - w 1 > E[{(, - r,)“] ^ dy di - I i “ Pi I . 

Because of (15), $r_i$, although more biased, may in many cases be preferable
to $e_i$ as an estimate of $\epsilon_i$.
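
The comparison between the two estimates can be checked numerically. The following is only a sketch under assumptions that are a reading of the text, not taken from it verbatim: the sample proportion $P_i$ is treated as normal with mean $\pi_i$ and standard deviation $\sigma_i$, $e_i$ is 1 when $P_i > \tfrac12$, and $r_i$ is the normal probability $\Phi((P_i - \tfrac12)/\sigma_i)$. The function names and the values 0.45 and 0.05 are hypothetical.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def moments(pi_i, sigma_i, steps=4000, width=8.0):
    """Return (p_i, E[r_i], E[r_i^2]) under the assumed normal model.

    p_i = P(P_i > 1/2) = E[e_i] = E[e_i^2]; the expectations of r_i are
    obtained by composite Simpson integration over P_i.
    """
    p = Phi((pi_i - 0.5) / sigma_i)
    lo = pi_i - width * sigma_i
    h = 2.0 * width * sigma_i / steps
    m1 = m2 = 0.0
    for k in range(steps + 1):
        P = lo + k * h
        w = 1 if k in (0, steps) else (4 if k % 2 else 2)  # Simpson weights
        r = Phi((P - 0.5) / sigma_i)                       # assumed form of r_i
        dens = phi((P - pi_i) / sigma_i) / sigma_i
        m1 += w * r * dens
        m2 += w * r * r * dens
    return p, m1 * h / 3.0, m2 * h / 3.0

# Hypothetical state with pi_i < 1/2, so that epsilon_i = 0.
p, Er, Er2 = moments(0.45, 0.05)
bias_e, bias_r = p, Er          # biases of e_i and r_i as estimates of epsilon_i = 0
mse_e, mse_r = p, Er2           # mean square errors E[(eps - e)^2], E[(eps - r)^2]
```

Under these assumptions `bias_r` exceeds `bias_e` while `mse_r` is smaller than `mse_e`, which is the pattern the text describes for $\pi_i < \tfrac12$.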



NOTES 

This section is devoted to brief research and expository articles, notes on methodology
and other short items.


A COMBINATORIAL FORMULA AND ITS APPLICATION TO THE 
THEORY OF PROBABILITY OF ARBITRARY EVENTS' 

By Kai-Lai Chung and Lietz C. Hsu
National Southwest Associated University, Kunming, China

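An important principle, known as a proposition in formal logic or the method
of cross-classification, can be stated as follows.²

Let $F$ and $f$ be any two functions of combinations out of $(\nu) = (1, 2, \ldots, n)$.
Then the two formulas

(1.1) $F((\alpha)) = \sum_{(\beta) \in (\nu) - (\alpha)} f((\alpha) + (\beta))$

(2.1) $f((\alpha)) = \sum_{(\beta) \in (\nu) - (\alpha)} (-1)^{b} F((\alpha) + (\beta))$

are equivalent, $b$ denoting the number of elements of $(\beta)$.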

As an immediate application to the theory of probability of arbitrary events,
we have the set of inversion formulas¹

(3.1) $p((\alpha)) = \sum_{(\beta) \in (\nu) - (\alpha)} p[(\alpha) + (\beta)]$

(4.1) $p[(\alpha)] = \sum_{(\beta) \in (\nu) - (\alpha)} (-1)^{b}\, p((\alpha) + (\beta))$

where $p((\alpha))$ is the probability of the occurrence of at least $E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_a}$
out of $n$ arbitrary events $E_1, E_2, \ldots, E_n$ and $p[(\alpha)]$ is the probability of the
occurrence of $E_{\alpha_1}, E_{\alpha_2}, \ldots, E_{\alpha_a}$ and no others among the $n$ events, $(\alpha_1, \alpha_2,
\ldots, \alpha_a)$ denoting a combination of the integers $(1, 2, \ldots, n)$. They can be
made to play a central rôle in the theory, since they supply a method for converting
the fundamental systems of probabilities, $p[(\alpha)]$ and $p((\alpha))$, one into the
other.
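
The equivalence of (3.1) and (4.1) can be illustrated by brute force on a small example. The sketch below is not part of the paper: it assumes $n = 4$ events and a randomly generated (hypothetical) joint law, builds the "at least" probabilities by (3.1), and recovers the "exactly" probabilities by the alternating sum (4.1).

```python
import random
from itertools import combinations, product

n = 4
events = range(n)
random.seed(0)

# Hypothetical joint law: one weight per outcome, an outcome being the
# set of events that occur. p_exact[s] plays the role of p[(alpha)].
outcomes = [frozenset(i for i in events if bits[i])
            for bits in product((0, 1), repeat=n)]
weights = [random.random() for _ in outcomes]
norm = sum(weights)
p_exact = {s: w / norm for s, w in zip(outcomes, weights)}

def p_at_least(alpha):
    """p((alpha)) by (3.1): the events in alpha occur, the rest are free."""
    return sum(pr for s, pr in p_exact.items() if alpha <= s)

def p_exact_from_at_least(alpha):
    """p[(alpha)] by (4.1): alternating sum over (beta) out of (nu) - (alpha)."""
    rest = [i for i in events if i not in alpha]
    acc = 0.0
    for b in range(len(rest) + 1):
        for beta in combinations(rest, b):
            acc += (-1) ** b * p_at_least(alpha | frozenset(beta))
    return acc

# Largest discrepancy between the recovered and the original probabilities.
max_error = max(abs(p_exact_from_at_least(a) - p_exact[a]) for a in outcomes)
```

The quantity `max_error` should be of the order of rounding error only.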

We may further generalize (1.1) and (2.1) by considering combinations with
repetitions. Let such a combination be written as

$(\alpha) = (\alpha^{r}) = (\alpha_1^{r_1} \alpha_2^{r_2} \cdots \alpha_a^{r_a})$

where $r_i$ $(r_i \geq 1)$ denotes the number of repetitions of the number $\alpha_i$, $i =
1, 2, \ldots, a$. Correspondingly we write

$(\alpha)' = (\alpha_1 \alpha_2 \cdots \alpha_a)$

and call it the reduced combination corresponding to $(\alpha)$.

¹ For the notations and definitions see K. L. Chung, "On fundamental systems of probabilities
of a finite number of events," Annals of Math. Stat., Vol. 14 (1943), pp. 123-138.

² Cf. Fréchet, Les probabilités associées à un système d'événements compatibles et dépendants,
Hermann, Paris (1939), formulas (65) and (68).

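If there are $n$ distinct elements $(1, 2, \ldots, n)$ in question, we may write every
combination in the form

$(1^{r_1} 2^{r_2} \cdots n^{r_n})$

where each $r_i$ is zero or a positive integer. We say that $(1^{s_1} 2^{s_2} \cdots n^{s_n})$ belongs
to $(1^{r_1} 2^{r_2} \cdots n^{r_n})$ and write

$(1^{s_1} 2^{s_2} \cdots n^{s_n}) \in (1^{r_1} 2^{r_2} \cdots n^{r_n})$

if and only if for each $i$, $i = 1, 2, \ldots, n$, we have $s_i \leq r_i$. We write

$(1^{r_1} 2^{r_2} \cdots n^{r_n}) + (1^{s_1} 2^{s_2} \cdots n^{s_n}) = (1^{r_1 + s_1} 2^{r_2 + s_2} \cdots n^{r_n + s_n})$

and if $(1^{s_1} 2^{s_2} \cdots n^{s_n}) \in (1^{r_1} 2^{r_2} \cdots n^{r_n})$,

$(1^{r_1} 2^{r_2} \cdots n^{r_n}) - (1^{s_1} 2^{s_2} \cdots n^{s_n}) = (1^{r_1 - s_1} 2^{r_2 - s_2} \cdots n^{r_n - s_n})$.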

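We define a generalized Möbius function $\mu((\alpha))$ for combinations (with or without
repetitions) as follows:

$\mu((\alpha)) = (-1)^{a}$ if $(\alpha) = (\alpha)'$, i.e. if $(\alpha)$ is without repetitions,
$\mu((\alpha)) = 0$ otherwise.

This function has the property

$\sum_{(\beta) \in (\alpha)} \mu((\beta)) = 1$ if $(\alpha) = (0)$, $= 0$ if $(\alpha) \neq (0)$.

For we have

$\sum_{(\beta) \in (\alpha)} \mu((\beta)) = \sum_{(\beta) \in (\alpha)'} (-1)^{b} = \sum_{b=0}^{a} \binom{a}{b} (-1)^{b}$,

and the last sum is 1 if $a = 0$ and 0 if $a \neq 0$, that is, 1 if $(\alpha) = (0)$ and 0 if $(\alpha) \neq (0)$.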
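
The property of the $\mu$-function can be verified by direct enumeration. In the sketch below, which is an illustration and not part of the paper, a combination with repetitions is represented (as an assumption of convenience) by its tuple of multiplicities $(r_1, \ldots, r_n)$; the function names are hypothetical.

```python
from itertools import product

def mobius(alpha):
    """mu((alpha)) for a combination given by its multiplicities (r_1, ..., r_n)."""
    if any(r > 1 for r in alpha):
        return 0                      # a repeated element gives mu = 0
    return (-1) ** sum(alpha)         # (-1)^a, a = number of elements present

def sub_combinations(alpha):
    """All (beta) belonging to (alpha): 0 <= s_i <= r_i for every i."""
    return product(*(range(r + 1) for r in alpha))

def mobius_sum(alpha):
    """Sum of mu((beta)) over all (beta) belonging to (alpha)."""
    return sum(mobius(beta) for beta in sub_combinations(alpha))
```

For example, `mobius_sum((0, 0, 0))` corresponds to $(\alpha) = (0)$, while `mobius_sum((2, 0, 1))` corresponds to the combination $(1^2 3^1)$.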

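Now we state and prove the following general theorem.

Theorem. Let $(\alpha)_i = (\alpha_{i1}^{r_{i1}} \cdots \alpha_{i a_i}^{r_{i a_i}})$ and $(\nu)_i = (1^{\lambda_{i1}} \cdots n_i^{\lambda_{i n_i}})$
where $\lambda_{ij}$ and $n_i$ are finite and $1 \leq r_{ij} \leq \lambda_{ij}$, $1 \leq a_i \leq n_i$ for $i = 1, 2, \ldots, m$
and $j = 1, 2, \ldots, n_i$. Then for any two functions of the $m$ combinations (with
repetitions), $(\alpha)_1, (\alpha)_2, \ldots, (\alpha)_m$ out of $(\nu)_1, (\nu)_2, \ldots, (\nu)_m$, the two sets of
formulas:

(1) $F((\alpha)_1, (\alpha)_2, \ldots, (\alpha)_m) = \sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} f((\alpha)_1 + (\beta)_1, (\alpha)_2 + (\beta)_2, \ldots, (\alpha)_m + (\beta)_m)$

and

(2) $f((\alpha)_1, (\alpha)_2, \ldots, (\alpha)_m) = \sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} \Big[\prod_{i=1}^{m} \mu((\beta)_i)\Big] F((\alpha)_1 + (\beta)_1, (\alpha)_2 + (\beta)_2, \ldots, (\alpha)_m + (\beta)_m)$

are equivalent.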

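Proof. To deduce (2) from (1), we substitute (1) into the right-hand side of (2):

$\sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} \Big[\prod_{i=1}^{m} \mu((\beta)_i)\Big] F((\alpha)_1 + (\beta)_1, \ldots, (\alpha)_m + (\beta)_m)$

$= \sum_{(\beta)_i \in (\nu)_i - (\alpha)_i} \Big[\prod_{i=1}^{m} \mu((\beta)_i)\Big] \sum_{(\gamma)_i \in (\nu)_i - (\alpha)_i - (\beta)_i} f((\alpha)_1 + (\beta)_1 + (\gamma)_1, \ldots, (\alpha)_m + (\beta)_m + (\gamma)_m)$

$= \sum_{(\delta)_i \in (\nu)_i - (\alpha)_i} f((\alpha)_1 + (\delta)_1, \ldots, (\alpha)_m + (\delta)_m) \cdot \sum_{(\gamma)_i \in (\delta)_i} \prod_{i=1}^{m} \mu((\delta)_i - (\gamma)_i)$.

Evidently we have

$\sum_{(\gamma)_i \in (\delta)_i} \prod_{i=1}^{m} \mu((\delta)_i - (\gamma)_i) = \prod_{i=1}^{m} \sum_{(\gamma)_i \in (\delta)_i} \mu((\delta)_i - (\gamma)_i)$,

which is 1 if $(\delta)_i = (0)$ for $i = 1, \ldots, m$, and 0 otherwise,
by the property of the $\mu$-function. Hence the preceding sum reduces to
$f((\alpha)_1, \ldots, (\alpha)_m)$ in accord with (2).

(1) is deduced from (2) in a similar way.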

Although the general case is not without importance in the treatment of
several sets of events,³ we shall for the sake of convenience restrict ourselves to
the special case $m = 1$.

In order to apply these formulas we must first introduce combinations with
repetitions into the theory of arbitrary events. This can be done in various
ways. Firstly, we may consider the number of occurrences of each event in a
given time-interval or in a series of trials not necessarily independent. Secondly,
we may regard each event as possessing various degrees of intensity. If the
event $E_i$ occurs $r_i$ times in a given time-interval or occurs with $r_i$ degrees of
intensity, we write it as $E_i^{r_i}$. Hereafter we shall make use of the first interpretation
and we shall assume that the maximum number of occurrences of each event
is finite:

$0 \leq r_i \leq \lambda_i$, $i = 1, \ldots, n$.

³ Cf. Fréchet, loc. cit., pp. 60-62; also, K. L. Chung, "Generalization of Poincaré's
formula in the theory of probability," Annals of Math. Stat., Vol. 14 (1943). We may note
that our general theorem may be used to give another proof of the generalized Poincaré's
formula for several sets of events.

We define

$p[E_1^{r_1} \cdots E_n^{r_n}] = p[(\nu^{r})]$ = The probability that $E_i$ occurs exactly $r_i$
times in the given time-interval.

$p(E_1^{r_1} \cdots E_n^{r_n}) = p((\nu^{r}))$ = The probability that $E_i$ occurs at least $r_i$
times in the given time-interval.

These quantities play the same rôle as the $p[(\alpha)]$'s and $p((\alpha))$'s in the ordinary
theory. Evidently the probability of every complex event in question can be
expressed as the sum of certain $p[(\nu^{r})]$'s. To prove that the $p((\nu^{r}))$'s also form
a fundamental system of quantities we have only to express $p[(\nu^{r})]$'s in terms of
the $p((\nu^{r}))$'s. This is given immediately by an application of the general
theorem with $m = 1$. For we have in an obvious way

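$p(E_1^{r_1} \cdots E_n^{r_n}) = \sum_{r_i \leq s_i \leq \lambda_i} p[E_1^{s_1} \cdots E_n^{s_n}]$

or

(3) $p((\nu^{r})) = \sum_{(\nu^{t}) \in (\nu^{\lambda}) - (\nu^{r})} p[(\nu^{r}) + (\nu^{t})] = \sum_{(\nu^{t}) \in (\nu^{\lambda}) - (\nu^{r})} p[(\nu^{r+t})]$.

Hence we obtain the inversion

(4) $p[(\nu^{r})] = \sum_{(\nu^{t}) \in (\nu^{\lambda}) - (\nu^{r})} \mu((\nu^{t}))\, p((\nu^{r}) + (\nu^{t}))$.

Let $(\alpha')$ denote a running combination without repetitions. Then since $\mu((\nu^{t})) =
0$ unless $(\nu^{t})$ is a $(\alpha')$,

(4') $p[(\nu^{r})] = \sum_{(\alpha') \in (\nu^{\lambda}) - (\nu^{r})} \mu((\alpha'))\, p((\nu^{r}) + (\alpha')) = \sum_{(\alpha') \in (\nu^{\lambda}) - (\nu^{r})} (-1)^{a}\, p((\nu^{r}) + (\alpha'))$.

The set of formulas (3) and (4) generalize (3.1) and (4.1).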
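
Formulas (3) and (4') can be tried out on a small case. The sketch below is an illustration only: it assumes three events with hypothetical maxima $\lambda = (2, 1, 3)$ and a randomly generated law for the numbers of occurrences, forms the "at least" probabilities by (3), and recovers the "exactly" probabilities by (4'), in which only steps $t_i \in \{0, 1\}$ with $r_i + t_i \leq \lambda_i$ contribute.

```python
import random
from itertools import product

lam = (2, 1, 3)                     # hypothetical maxima lambda_i
random.seed(1)

# Hypothetical law of the occurrence counts: exact[r] plays the role of p[(nu^r)].
grid = list(product(*(range(l + 1) for l in lam)))
weights = [random.random() for _ in grid]
norm = sum(weights)
exact = {r: w / norm for r, w in zip(grid, weights)}

def at_least(r):
    """p((nu^r)) by (3): every E_i occurs at least r_i times."""
    return sum(pr for s, pr in exact.items()
               if all(si >= ri for si, ri in zip(s, r)))

def exact_from_at_least(r):
    """p[(nu^r)] by (4'): alternating sum over combinations without repetitions."""
    steps = product(*(range(min(1, l - ri) + 1) for ri, l in zip(r, lam)))
    return sum((-1) ** sum(t) * at_least(tuple(ri + ti for ri, ti in zip(r, t)))
               for t in steps)

# Largest discrepancy between the recovered and the original probabilities.
worst = max(abs(exact_from_at_least(r) - exact[r]) for r in grid)
```

The value of `worst` should again be at the level of rounding error.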

Corresponding to the $p_{[a]}((\nu))$ for the ordinary events we define for $a + b +
\cdots = n$ and $r, s, \ldots$ all distinct:

$p_{[a]^{r} [b]^{s} \cdots}(E_1 \cdots E_n)$ = The probability that among $n$ events $E_1, E_2,
\ldots, E_n$ exactly $a$ events occur $r$ times, exactly $b$ events occur $s$ times and so on.

By (4) we easily obtain

(5) $p_{[a]^{r} [b]^{s} \cdots}(E_1 \cdots E_n) = \sum \sum_{(\nu^{t})} \mu((\nu^{t}))\, p((\nu^{t}) + (\alpha)^{r} + (\beta)^{s} + \cdots)$

where $(\alpha)^{r} = (E_{\alpha_1}^{r} \cdots E_{\alpha_a}^{r})$, $(\beta)^{s} = (E_{\beta_1}^{s} \cdots E_{\beta_b}^{s})$, $\ldots$ and the first summation
is a symmetric sum which extends to all $n!/a!\,b! \cdots$ different combinations
$(\alpha_1 \cdots \alpha_a)$, $(\beta_1 \cdots \beta_b)$, $\ldots$ out of $(\nu) = (1, 2, \ldots, n)$.

The equality (5) is obviously a generalization of Poincaré's formula.
Similarly for the probabilities in the definition of which the word "exactly"
is sometimes substituted for the words "at least." Of course we can express
all of them in terms of the $p[(\nu^{r})]$'s or of the $p((\nu^{r}))$'s. However, elegant formulas
such as in the ordinary theory seem to be lacking.

Finally, we may also consider conditions of existence for the $p[(\nu^{r})]$'s and the
$p((\nu^{r}))$'s. For the former system the conditions are that they be all non-negative
and that their sum be 1. For the latter system, the conditions are given by
(4'), viz. for every $(\nu^{r}) \in (\nu^{\lambda})$,

$\sum_{(\alpha') \in (\nu^{\lambda}) - (\nu^{r})} \mu((\alpha'))\, p((\nu^{r}) + (\alpha')) \geq 0$.

These conditions are necessary and sufficient since (3) and (4) are equivalent.


ON THE MECHANICS OF CLASSIFICATION 

By Carl F. Kossack 
University of Oregon 

1. Introduction. Wald¹ has recently determined the distribution of the
statistic $U$ to be used in the classification of an observation, $z_i$ $(i = 1, 2, \ldots, p)$,
as coming from one of two populations. He also determined the critical region
which is most powerful for such a classification. It is the purpose of this paper
to show how such a classification statistic under the assumption of large sampling
can be applied in an actual problem and to present a systematic approach to the
necessary computations.

The data used in this demonstration are those which were obtained from the
A.S.T.P. pre-engineering trainees assigned to the University of Oregon. The
problem considered is that of classifying a trainee as to whether he will do unsatisfactory
or satisfactory work² in the first term mathematics course (Intermediate
Algebra). The variables used in the classification are: (1) A Mathematics
Placement Test Score. This is the score obtained by the trainee on a
fifty-minute elementary mathematics test (including elementary algebra).
The test was given to each trainee on the day that he arrived on the campus.
(2) A High School Mathematics Score. A trainee's high school mathematics
record was made into a score by giving 1 point to students who had had no high
school algebra, 2 points to students with an F in first-year, high-school algebra
and no second-year algebra, 3 points for a P, ..., 10 points for an average grade
of A in first- and second-year algebra. (3) The Army General Classification
Test Score. An individual needed a score of 115 or better in order to be assigned
to the A.S.T.P. These data were obtained for 305 trainees along with the actual
grade made by them in the algebra course. Trainees who had had college work
were not included in the study.

¹ Abraham Wald, "On a statistical problem arising in the classification of an individual
into one of two groups," Annals of Math. Stat., Vol. 15 (1944), No. 2.

² Unsatisfactory work was defined as a grade of F or D in the course (failure or the lowest
passing grade).


2. Steps in the Computation of U and the Critical Region. Let
$\pi_1$ be the population of individuals who do unsatisfactory work in their first-term
mathematics course.

$\pi_2$ be the population of individuals who do satisfactory work.

$N_1$ and $N_2$ = respectively the number of observed individuals in $\pi_1$ and $\pi_2$.
$x_{1\alpha}$ and $y_{1\alpha}$ = respectively the Mathematics Placement Test Score for the
$\alpha$th individual observed in $\pi_1$ and $\pi_2$.
$x_{2\alpha}$ and $y_{2\alpha}$ = respectively the High School Mathematics Score.
$x_{3\alpha}$ and $y_{3\alpha}$ = respectively the Army General Classification Test Score.

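Step 1. Computation of Summations.

    Population $\pi_1$                                                  Population $\pi_2$

    $N_1 = 96$                                                          $N_2 = 209$
    $\sum x_{1\alpha} = 3570$                                           $\sum y_{1\alpha} = 11450$
    $\sum x_{2\alpha} = 547$                                            $\sum y_{2\alpha} = 1567$
    $\sum x_{3\alpha} = 11745$                                          $\sum y_{3\alpha} = 26684$
    $\sum x_{1\alpha}^2 = 145476$                                       $\sum y_{1\alpha}^2 = 672452$
    $\sum x_{2\alpha}^2 = 3509$                                         $\sum y_{2\alpha}^2 = 12577$
    $\sum x_{3\alpha}^2 = 1439559$                                      $\sum y_{3\alpha}^2 = 3421996$
    $\sum x_{1\alpha} x_{2\alpha} = 21012$                              $\sum y_{1\alpha} y_{2\alpha} = 88774$
    $\sum x_{1\alpha} x_{3\alpha} = 436964$                             $\sum y_{1\alpha} y_{3\alpha} = 1469302$
    $\sum x_{2\alpha} x_{3\alpha} = 66731$                              $\sum y_{2\alpha} y_{3\alpha} = 200150$
    $\sum (x_{1\alpha} - \bar{x}_1)^2 = 12716.625$                      $\sum (y_{1\alpha} - \bar{y}_1)^2 = 45167.311$
    $\sum (x_{2\alpha} - \bar{x}_2)^2 = 392.240$                        $\sum (y_{2\alpha} - \bar{y}_2)^2 = 828.249$
    $\sum (x_{3\alpha} - \bar{x}_3)^2 = 2631.656$                       $\sum (y_{3\alpha} - \bar{y}_3)^2 = 15125.876$
    $\sum (x_{1\alpha} - \bar{x}_1)(x_{2\alpha} - \bar{x}_2) = 670.438$     $\sum (y_{1\alpha} - \bar{y}_1)(y_{2\alpha} - \bar{y}_2) = 2926.392$
    $\sum (x_{1\alpha} - \bar{x}_1)(x_{3\alpha} - \bar{x}_3) = 196.812$     $\sum (y_{1\alpha} - \bar{y}_1)(y_{3\alpha} - \bar{y}_3) = 7427.359$
    $\sum (x_{2\alpha} - \bar{x}_2)(x_{3\alpha} - \bar{x}_3) = -191.031$    $\sum (y_{2\alpha} - \bar{y}_2)(y_{3\alpha} - \bar{y}_3) = 83.837$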
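
The sums about the means in Step 1 follow from the raw sums by the usual identity $\sum (u - \bar{u})(v - \bar{v}) = \sum uv - (\sum u)(\sum v)/N$. A minimal sketch of that arithmetic for population $\pi_1$ is given below; the dictionary layout and the helper name `centered` are hypothetical conveniences, and the figures are those of Step 1.

```python
def centered(sum_uv, sum_u, sum_v, n):
    """Sum of products about the means, computed from the raw sums."""
    return sum_uv - sum_u * sum_v / n

N1 = 96
sums_x = {1: 3570, 2: 547, 3: 11745}                  # sum of x_i
raw_x = {(1, 1): 145476, (2, 2): 3509, (3, 3): 1439559,
         (1, 2): 21012, (1, 3): 436964, (2, 3): 66731}  # sum of x_i * x_j

# Sums of squares and products about the means for population pi_1.
about_means_x = {ij: centered(v, sums_x[ij[0]], sums_x[ij[1]], N1)
                 for ij, v in raw_x.items()}
```

The same helper applied to the $y$ sums with $N_2 = 209$ gives the second column of the table.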


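Step 2. Computation of Statistics.

    $\bar{x}_1 = 37.188$        $\bar{y}_1 = 54.785$
    $\bar{x}_2 = 5.6979$        $\bar{y}_2 = 7.4976$
    $\bar{x}_3 = 122.3438$      $\bar{y}_3 = 127.6746$

$s_{ij} = \dfrac{\sum (x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j) + \sum (y_{i\alpha} - \bar{y}_i)(y_{j\alpha} - \bar{y}_j)}{N_1 + N_2 - 2}$

    $s_{11} = 191.04$           $s_{12} = 11.871$
    $s_{22} = 4.0280$           $s_{13} = 25.162$
    $s_{33} = 58.606$           $s_{23} = -.35378$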




Step 3. Computation of Inverse Matrix $\| s^{ij} \|$.

$|s_{ij}| = \begin{vmatrix} 191.04 & 11.871 & 25.162 \\ 11.871 & 4.0280 & -.35378 \\ 25.162 & -.35378 & 58.606 \end{vmatrix} = 34053$

    $s^{11} = .0069286$         $s^{12} = -.020692$
    $s^{22} = .31019$           $s^{13} = -.0030996$
    $s^{33} = .018459$          $s^{23} = .010756$

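Step 4. Computation of the Classification Equation.

$U = [s^{11}(\bar{y}_1 - \bar{x}_1) + s^{12}(\bar{y}_2 - \bar{x}_2) + s^{13}(\bar{y}_3 - \bar{x}_3)] \cdot z_1$

$\quad + [s^{21}(\bar{y}_1 - \bar{x}_1) + s^{22}(\bar{y}_2 - \bar{x}_2) + s^{23}(\bar{y}_3 - \bar{x}_3)] \cdot z_2$

$\quad + [s^{31}(\bar{y}_1 - \bar{x}_1) + s^{32}(\bar{y}_2 - \bar{x}_2) + s^{33}(\bar{y}_3 - \bar{x}_3)] \cdot z_3$

where $z_i$ plays the same role for individuals to be classified as $x_{i\alpha}$ and $y_{i\alpha}$ do for
observed individuals.

$U = .068160\, z_1 + .25147\, z_2 + .063215\, z_3$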
Step 5. Computation of the Critical Region (assuming $W_1 = W_2$).

$\bar{a}_1 = .068160\, \bar{x}_1 + .25147\, \bar{x}_2 + .063215\, \bar{x}_3 = 11.702$

$\bar{a}_2 = .068160\, \bar{y}_1 + .25147\, \bar{y}_2 + .063215\, \bar{y}_3 = 13.691$

$\tfrac12 (\bar{a}_1 + \bar{a}_2) = 12.696$

Therefore,

For $U < 12.696$ classify the individual as coming from the $\pi_1$ population.

For $U > 12.696$ classify the individual as coming from the $\pi_2$ population.
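
Steps 2 through 5 can be retraced mechanically from the figures of Steps 1 and 2. The following is a minimal sketch, not the author's own computing scheme: it pools the two sets of sums about the means, inverts the 3 by 3 matrix by cofactors, and forms the coefficients of $U$ and the cutting point. All function and variable names are hypothetical, and small differences from the printed values are to be expected because the printed intermediate figures are rounded.

```python
N1, N2 = 96, 209
x_bar = [37.188, 5.6979, 122.3438]    # means in population pi_1 (Step 2)
y_bar = [54.785, 7.4976, 127.6746]    # means in population pi_2 (Step 2)

# Sums of squares and products about the means (Step 1), as symmetric arrays.
Cx = [[12716.625, 670.438, 196.812],
      [670.438, 392.240, -191.031],
      [196.812, -191.031, 2631.656]]
Cy = [[45167.311, 2926.392, 7427.359],
      [2926.392, 828.249, 83.837],
      [7427.359, 83.837, 15125.876]]

def pooled(cx, cy, n1, n2):
    """Pooled covariances s_ij of Step 2."""
    return [[(cx[i][j] + cy[i][j]) / (n1 + n2 - 2) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3 by 3 matrix, expanded along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inverse3(m):
    """Inverse of a 3 by 3 matrix by cofactors (Step 3)."""
    d = det3(m)
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            rows = [k for k in range(3) if k != j]   # delete row j
            cols = [k for k in range(3) if k != i]   # delete column i
            minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                     - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
            inv[i][j] = (-1) ** (i + j) * minor / d
    return inv

s = pooled(Cx, Cy, N1, N2)
s_inv = inverse3(s)
diff = [y - x for x, y in zip(x_bar, y_bar)]
coef = [sum(s_inv[i][j] * diff[j] for j in range(3)) for i in range(3)]  # Step 4
a1 = sum(c * x for c, x in zip(coef, x_bar))
a2 = sum(c * y for c, y in zip(coef, y_bar))
cutoff = 0.5 * (a1 + a2)                                                  # Step 5

def classify(z):
    """Return 1 (population pi_1) or 2 (population pi_2) for scores z = (z1, z2, z3)."""
    u = sum(c * v for c, v in zip(coef, z))
    return 1 if u < cutoff else 2
```

As a usage example, `classify(x_bar)` and `classify(y_bar)` assign the two vectors of means to their own populations.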

Step 6. Computation of the Efficiency of Classification.

$\lambda = s^{11}(\bar{y}_1 - \bar{x}_1)(\bar{y}_1 - \bar{x}_1) + s^{12}(\bar{y}_1 - \bar{x}_1)(\bar{y}_2 - \bar{x}_2) + s^{13}(\bar{y}_1 - \bar{x}_1)(\bar{y}_3 - \bar{x}_3)$

$\quad + s^{21}(\bar{y}_2 - \bar{x}_2)(\bar{y}_1 - \bar{x}_1) + s^{22}(\bar{y}_2 - \bar{x}_2)(\bar{y}_2 - \bar{x}_2) + s^{23}(\bar{y}_2 - \bar{x}_2)(\bar{y}_3 - \bar{x}_3)$

$\quad + s^{31}(\bar{y}_3 - \bar{x}_3)(\bar{y}_1 - \bar{x}_1) + s^{32}(\bar{y}_3 - \bar{x}_3)(\bar{y}_2 - \bar{x}_2) + s^{33}(\bar{y}_3 - \bar{x}_3)(\bar{y}_3 - \bar{x}_3)$

$\quad = 1.5764$.

$P_1 = 1 - P_2 = \dfrac{1}{\sqrt{2\pi}} \int_{.792}^{\infty} e^{-t^2/2}\,dt = .2062$

where $P_1$ is the probability of making an error of Type I, that is, of classifying
an individual as one who will do satisfactory work when he actually does unsatisfactory
work; and $1 - P_2$ is the probability of making an error of Type II,
that is, of classifying a student as one who will do unsatisfactory work when he
actually does satisfactory work.

3. Conclusions. In using the above classification equation to classify the
305 trainees used in this study, 21 errors of Type I were made or 22.9 percent,
while 50 errors of Type II were made or 23.9 percent. These percentages seem
reasonably close to the expected 20.6 percent.


NOTE ON AN IDENTITY IN THE INCOMPLETE BETA FUNCTION 


By T. A. Bancroft
Iowa State College


Since the incomplete beta function has proved of some importance in statistics,
it would appear that any additional information concerning its properties might
at some time prove useful. In a paper by the author [1], two identities in the
incomplete beta function were incidentally obtained. They are as follows:

(1) $(p + q)\, I_x(p, q) = p\, I_x(p + 1, q) + q\, I_x(p, q + 1)$

and

(2) $(p + q + 1)^{(2)} I_x(p, q) = (p + 1)^{(2)} I_x(p + 2, q) + 2pq\, I_x(p + 1, q + 1) + (q + 1)^{(2)} I_x(p, q + 2)$,

where the incomplete beta function $I_x(p, q) = \dfrac{B_x(p, q)}{B(p, q)}$, etc., and $(p + 1)^{(2)}$
etc. refer to the standard factorial notation, e.g. $(p + 1)^{(2)} = (p + 1)p$.

Written in the above form these two identities suggest a possible general
identity to which they belong as special cases. The third special case suggested is:

(3) $(p + q + 2)^{(3)} I_x(p, q) = (p + 2)^{(3)} I_x(p + 3, q) + 3(p + 1)^{(2)} q\, I_x(p + 2, q + 1) + 3p(q + 1)^{(2)} I_x(p + 1, q + 2) + (q + 2)^{(3)} I_x(p, q + 3)$.

The general formula suggested is

(4) $(p + q + n - 1)^{(n)} I_x(p, q) = \sum_{r=0}^{n} \binom{n}{r} (p + n - r - 1)^{(n-r)} (q + r - 1)^{(r)} I_x(p + n - r, q + r)$.

To prove the general formula we write (4) as

(5) $(p + q + n - 1)^{(n)} I_x(p, q) = \sum_{r=0}^{n} \binom{n}{r} (p + n - r - 1)^{(n-r)} (q + r - 1)^{(r)} \dfrac{B_x(p + n - r, q + r)}{B(p + n - r, q + r)}$.


By expanding and simplifying it is easy to show that

(6) $\dfrac{(p + n - r - 1)^{(n-r)} (q + r - 1)^{(r)}}{B(p + n - r, q + r)} = \dfrac{(p + q + n - 1)^{(n)}}{B(p, q)}$.

Using (6) the right hand side of (5) reduces to

(7) $\dfrac{(p + q + n - 1)^{(n)}}{B(p, q)} \sum_{r=0}^{n} \binom{n}{r} B_x(p + n - r, q + r)$.

The summed function in (7) reduces to

(8) $\int_0^x x^{p-1} (1 - x)^{q-1} [x + (1 - x)]^{n}\,dx = B_x(p, q)$,

which proves the identity.

Although the general identity is quite simple to prove, it does not seem to
have appeared in the literature.
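
The general identity (4) lends itself to a quick numerical check. The sketch below is illustrative only: it evaluates $I_x(p, q)$ by Simpson's rule (adequate here under the assumption $p, q \geq 1$, so that the integrand is bounded) and compares the two sides of (4). The function names and the test values of $x$, $p$, $q$, $n$ are hypothetical.

```python
import math

def beta(p, q):
    """Complete beta function B(p, q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def inc_beta(x, p, q, steps=2000):
    """I_x(p, q) = B_x(p, q) / B(p, q) by composite Simpson's rule (p, q >= 1)."""
    h = x / steps
    f = lambda t: t ** (p - 1) * (1 - t) ** (q - 1)
    total = f(0.0) + f(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0 / beta(p, q)

def falling(a, k):
    """Factorial power a^(k) = a (a - 1) ... (a - k + 1), with a^(0) = 1."""
    out = 1.0
    for j in range(k):
        out *= a - j
    return out

def lhs(x, p, q, n):
    """Left-hand side of (4)."""
    return falling(p + q + n - 1, n) * inc_beta(x, p, q)

def rhs(x, p, q, n):
    """Right-hand side of (4)."""
    return sum(math.comb(n, r)
               * falling(p + n - r - 1, n - r)
               * falling(q + r - 1, r)
               * inc_beta(x, p + n - r, q + r)
               for r in range(n + 1))
```

Taking $n = 1, 2, 3$ in `lhs` and `rhs` reproduces the special cases (1), (2) and (3).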


REFERENCE 

[1] Bancroft, T. A. "On biases in estimation due to the use of preliminary tests of significance,"
Annals of Math. Stat., Vol. 15 (1944), No. 2.



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items

Archie Blake is now employed as a ballistician with the Ballistic Research
Laboratory at Aberdeen Proving Ground.

Robert V. Bonnar is now employed as Associate Technologist at the Mare
Island Navy Yard.

Professor W. G. Cochran has returned to his regular duties at Iowa State
College.

Mrs. Bianca Cody (Bianca Rivoli) is now Statistician for the James O. Peck
Research Company, 12 East 41st Street, New York City.

Associate Professor William Feller of Brown University has been appointed
Professor of Mathematics at Cornell University.

Professor John Kenney of the University of Wisconsin is now located at the
Milwaukee branch of the University.

Myra Levine is now Assistant Mathematical Statistician with the Statistical
Research Group at Columbia University.

Mrs. Harold Michaelis (Ruth E. Jolliffe) is 5th Naval District Statistician
at the Naval Operating Base in Norfolk, Va.

Emma Spaney is Statistician for the Committee on Measurement of the
National League of Nursing Education.

Professor J. A. Shohat of the University of Pennsylvania died October 8, 1944.

Mr. Redford T. Webster of the Western Electric Company died July 31, 1944.


New Members 

The following persons have been elected to membership in the Institute: 

Boddle, John B., Jr. Chief, Program Section, Budget Division, Washington, D. C. S6t8
Tunlaw Road, N.W.

Bruner, Nancy M.A. (Iowa) Statistician, Western Auto Supply Co., Kansas City, Mo.
7B11 Main St.

Christopher, Edward E. B.S. (Mass. Inst. Tech.) Statistician, Signal Corps. B70i North
SSlh St., Arlington, Va.

Cowden, Dudley J. Ph.D. (Columbia) Prof. of Economics, Univ. of North Carolina.
Box SIB, Chapel Hill, North Carolina.

Cynamon, Manuel M.S. (City Coll., N. Y.) Personnel Tech., Personnel Res. Sec., Adj.
General's Office, War Dept. iO Ave. P, Brooklyn 4, N. Y.

Evensen, Edward J. On military leave from Metropolitan Life Ins. Co. (Actuarial Sec.)
Sv. Co., 1st Sp. Sv. Force.

Green, Earl L. Ph.D. (Brown) 1st Lieut., A.C., Chief, Dept. of Statistics, AAF School
of Aviation Medicine, Randolph Field, Texas.

Groves, William Brewster B.S. (Antioch) Economist, Off. of Price Administration.
Decatur St., N.W., Washington, D. C.



Homseth, Richard Allen M.A. (Wisconsin) Res. Assistant in Sociology, Univ. of Wisconsin.
S07 N. Randall, Madison B, Wis.

Kinsler, David M. M.A. (Chicago) Chief, Analytical Section, Arms & Ammunition Division,
Aberdeen Proving Ground, Maryland.

Kopp, Paul J. M.A. (Duke) Major, Chemical Warfare Service, U. S. A. ISOS North
Adams St., Arlington, Va.

Massey, Frank Jones, Jr. M.A. (California) Associate, Dept. of Math., Univ. of California,
Berkeley, Calif. ISBJ^ Union St., San Francisco 9, Calif.

Orcutt, Guy H. Ph.D. (Michigan) Instr., Economics Dept., Mass. Inst. of Tech., Cambridge,
Mass.

Rakesky, Sophie M.S. (Michigan) Statistician, W. K. Kellogg Foundation, Battle Creek,
Mich.

Roberts, Jean M.S. (Minnesota) Statistician, Child Welfare Res. Analyst. 919 Goodrich
Ave., St. Paul S, Minn.

Schletroma, William B.S.S. (Coll. of City of N. Y.) Research Assistant. 316 East lIBlh
St., New York, N. Y.

Schlorek, Mary A. A.B. (Adelphi) Research Statistician, National Broadcasting Co.,
30 Rockefeller Plaza, New York, N. Y.

deSousa, Alvaro Pedro B.E. (Liverpool) Vice-Governor, Banco de Portugal. Monserrate,
Rua Infante de Sagres, Estoril, Portugal.

Steele, Floyd George M.S. (Calif. Inst. of Tech.) Stat. Analyst, Douglas Aircraft. 18168
Roosevelt Highway, Pacific Palisades, Calif.

Thom, Herbert C. S. 613018th Rd., N., Arlington, Va.


Report of the Fifth Pittsburgh Chapter Meeting

The fifth meeting of the Pittsburgh Chapter of the Institute of Mathematical
Statistics was held at Engineering Hall, Carnegie Institute of Technology on
Saturday, November 25, 1944. The meeting was held as a joint session with the
Pittsburgh Quality Control Society. Thirty-one persons attended the meeting,
including the following six members of the Institute:

George Eldredge, H. J. Hand, C. R. Mummery, E. G. Olds, E. M. Schrock, J. V. Sturtevant.

The following papers were presented, with Mr. J. V. Sturtevant, of the Carnegie
Illinois Steel Corporation, acting as chairman:

1. Modified Application of Control Chart to the Use of Gauges on Machine Tool Work.

Dr. E. G. Olds, War Production Board, Washington, D. C.

2. Application of Control Charts to Infrequent Inspection of Machine Operations.

W. D. Angst, Thompson Aircraft Products Company, Cleveland, Ohio.

3. Application of Control Chart Techniques to Checking Reproducibility of Chemical
Analysis.

H. A. Stobbs, Wheeling Steel Corporation, Steubenville, Ohio.

4. Statistical Principles of Experimental Design as Applied to Tests Conducted in Manufacturing
Operations.

Dr. B. Epstein, Westinghouse Electric & Manufacturing Co., East Pittsburgh, Pa.

H. J. Hand,

Secretary-Treasurer, Pittsburgh Chapter






Educational Meetings of the Pittsburgh Chapter 

The first of a series of educational meetings on methods of statistical computations
given by the Pittsburgh Chapter was held on Saturday afternoon, January
20, 1945. Thirty-three persons attended the meeting, including the following
three members of the Institute:

Thomas A. Elkins, H. J. Hand, J. V. Sturtevant.

The following program was presented:

1. Potential Field for Industrial Applications of Statistical Methods.

H. J. Hand, National Tube Company, Pittsburgh, Pa.

2. Computations for Analysis of Variance and Experimental Design.

Ben Epstein, Westinghouse Electric & Manufacturing Company, East Pittsburgh,
Pa.

It is planned to hold these meetings bi-weekly, on Saturday afternoons for an
indefinite period in the future. Topics to be considered in the series will include:

1. Analysis of variance and covariance.

2. Design of experiments.

3. Tests of significance.

4. Probability and probability distributions.

5. Correlation and regression analysis, including the orthogonal coordinate method.

6. Tests of increased severity.

7. Sampling theory, including stratification.

8. Acceptance-rejection mathematics, Dodge sampling inspection tables.

9. Shewhart control chart techniques.

10. Analysis of runs.

11. Cycle analysis.

12. Factor analysis.


H. J. Hand,

Secretary-Treasurer, Pittsburgh Chapter



ANNUAL REPORT OF THE PRESIDENT OF THE INSTITUTE 


Continuing the established tradition, the annual summer meeting was held at
Wellesley, Massachusetts, August 12-13, 1944 in conjunction with the Summer 
Meetings of the American Mathematical Society and the Mathematical Associa¬ 
tion of America. A regional meeting was held in Washington, May 6-7, in 
conjunction with the meeting of the Washington Chapter of the American 
Statistical Association. The programs were arranged by the Program Com¬ 
mittee: W. Feller, Chairman, W. G. Madow, and A. Wald. 

Even though, under present war conditions, research in the field of probability 
and statistics is very much curtailed, enough papers in mathematical statistics 
of satisfactory quality have been proposed for publication in the Annals in 1944 
to keep the total volume of material at approximately five hundred pages or the 
level of the last few years. However, the outlook for a sufficient number of 
satisfactory papers to maintain the usual volume of publication during 1945 does 
not look quite so favorable. 

Looking into the future, the Institute must continue to furnish, through the 
Annals, a medium for the publication of all important results of original research 
in the field of mathematical statistics as they become available. To do otherwise 
would be suicide. At the same time we must take account of the growing need 
for comprehensive surveys of statistical theory on the part of other scientists, 
including not only social scientists but also physicists, chemists, biologists, and 
research engineers, whose interest in the contributions of mathematical statistics 
has been greatly stimulated during the war. Only the mathematical statistician
of broad competence can provide adequate critical surveys of this character. 
Perhaps some of this need can be met through survey articles published in the 
Annals, although it is not an easy matter to get capable men to do such work. 
Perhaps the time is not far off when the Institute must stimulate the preparation 
of such material by instituting an annual series of Colloquium Lectures patterned 
somewhat after those of the Mathematical Society, which could be published 
separately. 

This is but one of many problems that the Institute faces in its post-war 
development. Not only must it assume the responsibility of stimulating and 
encouraging research and of publishing the results; it must also consider the 
problem of training the research statistician of tomorrow as well as those who 
are to apply mathematical statistics in the many fields of science. It also must 
assume some responsibility for keeping in contact with other scientists in order 
that the mathematical statistician may become acquainted with the unsolved 
statistical problems of the scientist. There are also many problems of a professional
character that face the mathematical statistician in the future if he is
to succeed in developing the profession of mathematical statistics to the level 
attained by some of the older scientific professions. 

With the realization of the need for a concerted attack on some of these 




problems, the Board of Directors at its meeting in May set up two committees,
one on Training and Placement of Statisticians under Harold Hotelling and the
other on Post-War Development of the Institute under W. G. Cochran. Interim
reports received by the Board from both committees indicate that considerable
progress has been made to date. They also indicate, however, that much
more work remains to be done.

At the same meeting of the Board, a Budget and Finance Committee was set
up, consisting of P. S. Dwyer, Chairman, C. H. Fischer, A. C. Olshen, and C. F.
Roos, to prepare a report on the policy that should be followed by the Institute
in respect to such items as investment of funds, advertising, preparation of an
annual statement, and the like. Some of the work of this committee has already
borne fruit, as, for example, in providing the actuarial basis for life membership
adopted at the Wellesley meeting and in establishing certain principles to be
used in conducting the business of the Institute.

A report of the Committee on Membership, W. G. Cochran, Chairman, P. S.
Dwyer, and T. Koopmans, appears elsewhere in this issue of the Annals. Upon
recommendation of this committee, the Board of Directors elected nine new
fellows: Walter Bartky, C. I. Bliss, Gertrude M. Cox, P. G. Hoel, M. G. Kendall,
H. B. Mann, E. S. Pearson, Henry Scheffé, and W. A. Wallis.

The nominating committee for the year consisted of John Curtiss, Chairman,
E. G. Olds, and F. F. Stephan. G. W. Snedecor served the Institute again as its
representative on the Council of the A.A.A.S.

The annual election of the Institute just concluded by mail ballot resulted
in the election of the following officers for 1945: W. E. Deming, President; W. G.
Cochran and J. L. Doob, Vice-Presidents.


February 10, 1945


Walter A. Shewhart
President, 1944


ANNUAL REPORT OF THE SECRETARY-TREASURER 
OF THE INSTITUTE 

Accounts of the 1944 meetings of the Institute (the Wellesley meeting, the
Washington regional meeting, and the Pittsburgh chapter meetings) have appeared
in appropriate issues of the Annals.

At the Wellesley meeting a number of amendments to the Constitution and
By-Laws were passed. These were published in the September, 1944, issue of
the Annals. (The amended Constitution and By-Laws appear elsewhere in this
issue.)

Due to a large extent to the cooperation of the membership in sending in nominations,
the Institute enjoyed a large increase in membership during the year.
There were some resignations and it was necessary to suspend fifteen persons at
the end of 1944 because of failure to pay dues. It is apparent that, in some of
these cases at least, our mail is not being received. Undoubtedly some of these
memberships will be restored when contact is again established. As of January
1, 1945, there were 606 members, a net gain of approximately one hundred
members.

During the year the Institute received gifts from Professor Harry Carver in 
the form of exchanges for early issues of the Annals, reprints of early articles, etc. 

The Secretary-Treasurer wishes to acknowledge the continued assistance of 
Professor Lloyd Knowler in looking after the back issues of the Annals which are 
stored at Iowa City. 

The following financial statement covers the period from December 22, 1943
to December 31, 1944 (the books and records of the Treasurer have been audited
by Professor Thomas A. Bickerstaff and were found to be in agreement with the
statement as submitted):

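FINANCIAL STATEMENT
December 22, 1943, to December 31, 1944


Receipts

Balance on Hand, December 22, 1943 ............................    $3,715.05

Dues
    1944 and before ........................    $2,996.31
    1945 and 1946 ..........................     1,127.00
    Life ...................................       330.00
                                                                    4,453.31

Subscriptions
    1944 and before ........................    $1,301.94
    1945 and 1946 ..........................       883.94
                                                                    2,185.88

Sale of Back Numbers ..........................................     1,385.02

Miscellaneous .................................................         5.15


Total Receipts ................................................   $11,744.41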



Expenditures

Annals, Current
    Office of Editor .......................      $273.77
    Waverly Press ..........................     3,448.51
                                                                   $3,722.28

Annals, Back Numbers
    Purchase from H. C. Carver .............      $149.40
    Iowa City Office .......................        96.26
                                                                      245.66

Office of Secretary-Treasurer
    Printing, mimeographing, programs, etc.
      (including stamped envelopes) ........      $377.00
    Postage and supplies ...................        68.02
    Clerical help ..........................       456.94
    Moving office from Pittsburgh ..........        65.70
                                                                      956.75

Miscellaneous .................................................        29.07

Balance on Hand, December 31, 1944 ............................     6,790.65


                                                                  $11,744.41


No unpaid bills were in the hands of the Treasurer as of December 31, 1944,
and aside from an additional $100.00 which the Board has designated for Annals
expense for 1944, there were no large bills outstanding.

Accounts receivable as of December 31, 1944, amounted to $3(X3.73. Many 
of these accounts are current accounts while some of the older ones are accounts 
with firms in India, which probably will be collected eventually. 

The American Library Association continued with its purchase of thirty sets
of Volume XV of the Annals (for post war distribution) and the Universal Trading
Corporation (representing the Chinese Government) purchased twenty
sets of Volumes 11-17 inclusive. These orders contributed in no small way to
the total 1944 income of $8,029.36.

The 1944 balance $6,790.65 (consisting of bank balance of $3,790.65 and 
$3,000.00 in government bonds) is $3,076.60 higher than it was on December 21,
1943. This increase is due in part to 1944 business and in part to the fact that
unusually large payments toward future business, such as the $330.00 in life
payments and the $1,127.00 in 1945 and 1946 dues, have been made.

To summarize the situation briefly, the Institute’s 1944 activity has resulted 
in a gain of approximately $1,600.00 and we are about this much in advance 
of our usual position with reference to the payments of following years.


December 31, 1944


Paul S. Dwyer
Secretary-Treasurer.











REPORT OF THE MEMBERSHIP COMMITTEE OF THE INSTITUTE


Since the duties of this Committee are not defined in detail in the Constitution,
the Board of Directors asked the Committee to prepare a statement describing
the appropriate composition and function of the Committee on Membership.
This resulted in the preparation of amendments to the Constitution and
By-Laws. These amendments were passed at the business meeting at Wellesley
on August 13, 1944, and are printed in full in the September, 1944, issue
of the Annals (p. 340).

Briefly, the duties of the Committee are specified as follows in these amendments:

The Committee holds the power of election to the grades of Member and
Junior Member and makes recommendations to the Board of Directors with
reference to placing members in the other grades of membership.
It is the duty of the Committee to prepare and make available through
the Secretary-Treasurer an announcement of the qualifications necessary for
the different grades of membership and to review these qualifications periodically.
The Committee considers plans for increasing the number of applicants
for membership.

As permitted by the amendments referred to above, the power of election to
the grades of Member and Junior Member was delegated by the Committee in
[illegible] 1944, to the Secretary-Treasurer, subject to certain reservations. The
statement of qualifications for the different grades of membership as mentioned
above is published below. At the August 13 meeting of the Board of
Directors it was decided that no elections should be made at present to the grades
of Honorary Member and Sustaining Member.

On the recommendation of the Membership Committee the following members
were elected as Fellows by the Board of Directors: W. Bartky, C. I. Bliss, G. M.
[illegible] Horst, M. G. Kendall, H. B. Mann, E. S. Pearson, H. Scheffé, W. A.


Statement of Qualifications for the Different Grades of 
Membership in the Institute of 
Mathematical Statistics 

Member. The candidate shall either (a) be actively engaged in or show a
[illegible] interest in mathematical statistics, or (b) be interested in some applied
[illegible] statistics, with a desire to keep himself informed regarding recent
developments in mathematical theory and techniques.

Junior Member.

1. Any undergraduate student of a collegiate institution is eligible for election
as Junior Member of the Institute of Mathematical Statistics provided that he
is sponsored by a member of the Institute.

2. The annual dues ($2.50) must be submitted with the application.



3. Annual membership shall coincide with the calendar year and the Junior
Member shall receive a complete volume of the Annals of Mathematical Statistics
for the year in which he or she is elected.

4. Junior Membership shall be limited to a term of two years, but a Junior
Member may apply for transfer to ordinary membership at the beginning of his
second year.

Fellow. 

1. The candidate shall have evidenced continuing activity in research in
mathematical statistics by publication beyond his doctor’s dissertation of in- 
dependent work of merit. Normally two or three worthwhile papers beyond the 
dissertation will be required to establish this fact. 

2. The first qualification may be partly or wholly waived in the case of (a) 
a candidate of well-established leadership among mathematical statisticians whose
contributions to the development of the field of mathematical statistics other 
than sufficient published original research shall be judged of equal value or (b) 
a candidate of well-established leadership in the applications of mathematical 
statistics, whose work has contributed greatly to the utility of and the appreciation
for mathematical statistics.

Honorary Member. A person of exceptional ability and acknowledged leadership
in the field of mathematical statistics may be elected to the grade of Honorary
Member by the Board of Directors, upon the recommendation of the
Committee on Membership.

Sustaining Member. The Board of Directors shall have the power to elect to
Sustaining Membership any individual, group or corporation that is interested 
in furthering the purposes for which the Institute was formed. 

W. G. Cochran (Chairman)
W. E. Deming
P. S. Dwyer
T. Koopmans


February 10, 1945



PROGRESS REPORT OF THE COMMITTEE ON POST-WAR 
DEVELOPMENT OF THE INSTITUTE 

In considering the post-war development of the Institute of Mathematical 
Statistics, the Committee has recognized two general problems: 

A. The problem of what additional activities the Institute should undertake
in order to provide further stimulus to the development of the field of 
mathematical statistics. 

B. The problem of determining how the Institute can cooperate more effectively
with the users of statistical techniques.

Because of rapidly increasing interest in the application of statistical methods 
in many different fields, the Committee has directed most of its attention thus 
far to Problem B; the present progress report is concerned with the work of the 
Committee on this problem. The Committee hopes to submit a report on 
Problem A at the end of 1945. 

With respect to Problem B, it is the opinion of the Committee that a central 
organization for the statistical societies should be of common interest. Accord¬ 
ingly, a plan was worked out and submitted to the Board of Directors of the 
Institute at the Wellesley meeting of the Institute. This proposal and its 
present status are discussed below. 

We believe that there is much to be gained from an organization that would 
form a link between the various statistical societies, and would have the following 
principal aims: 

(1) To represent the members of the societies in all matters of common interest.

(2) To promote cooperation between statisticians working in the different 
fields of application, and between mathematical statistics, applied statis¬ 
tics, scientific research and the industries. 

(3) To develop amongst the public an appreciation of the value of the statisti¬ 
cal method in scientific inquiry. 

It is our opinion that an organization similar to that of the Institute of Physics
would be suitable. The statistical societies, while retaining their present
autonomies, would become founding members of a corporation whose governing
board would contain representatives from each society. In pursuance of its aims
as outlined above, the new organization might:

(a) Take the lead in formulating policies on questions which concern all
statisticians. 

(b) Publish a journal of general interest to statisticians and undertake the 
routine work in connection with the publication of the journals of the
individual societies, the societies retaining in full their present
responsibility for the contents of their journals.

(c) Arrange joint meetings between different statistical societies and between
statistical and other scientific societies.




(d) Assist new groups in organizing for their benefit, either under the auspices 
of one of the present societies or in a new society, which might at first be 
given associate membership and later full membership of the central 
organization. 

(e) Take steps to bring news about the use of statistics in scientific research 
to the attention of the public and more particularly of leaders in industry, 
in federal, state and local agencies and in education. 

(f) Investigate the demands for various types and degrees of statistical 
training, outline courses of training in statistics suitable for meeting these 
demands and make strenuous efforts to have the recommended courses 
of training put into effect, in order that statisticians can be of fullest 
service in the nation's work. In this connection an information and 
placement bureau may be an appropriate auxiliary. 

(g) Institute an abstracting service in statistical methodology. This might 
take the form of a periodical publication of abstracts of papers with respect 
to their methodological content rather than their subject matter. The 
coverage would include journals of business, marketing, engineering,
medicine and agriculture as well as purely statistical publications. 

The financial needs of the new organization, which would maintain a paid 
full-time staff, may be met initially by contributions from the present societies. 
In view of the extra services which would be rendered to statisticians, some 
increase in the subscription rates of the present societies appears reasonable. A 
member who belongs to more than one of the present societies would pay the 
extra amount only once. Supplementary income might be derived from
advertising in the journal of the central organization and from the establishment
of sustaining or corporate memberships in the central organization. 

At the time of the Wellesley meeting of the Board, there had been only in¬ 
formal contacts between members of this Committee and members of other 
statistical societies. We considered it our first task to obtain some consensus of 
opinion from the standpoint of the Institute of Mathematical Statistics. Fol¬ 
lowing general approval by the Board of Directors of the Institute, members of 
the Committee discussed the proposal for a central organization with representa¬ 
tives of several other statistical societies. The American Statistical Association 
has a Committee to consider the future structure of the Association and this 
Committee brought the Institute proposal before the Board of Directors of the 
Association for action. As the oldest of the statistical societies, the American 
Statistical Association then invited participation in an intersociety committee 
by the Institute and nine other societies or sections, directly or indirectly con¬ 
cerned with statistical method. This committee is to explore the possibilities of 
coordinating the activities of the several statistical societies and report its 
recommendations back to each organization. The representatives have now
been named and the first meeting was held on February 10, 1945, in New York.
At this meeting the Institute was represented by W. G. Cochran and Lt. John 
H. Curtiss. 





With regard to the problem of what additional activities the Institute should 
undertake in order to furnish additional stimulation to the development of the 
field of mathematical statistics, the Committee has discussed several ideas which 
appear promising. It is hoped to present a complete report on this phase of the 
Committee's work at the end of this year.

C. I. Bliss
W. G. Cochran (Chairman)
W. E. Deming
P. S. Olmstead
S. S. Wilks


February 12, 1945 



CONSTITUTION 
OF THE 

INSTITUTE OF MATHEMATICAL STATISTICS 

ARTICLE I 
Name and Purpose

1. This organization shall be known as the Institute of Mathematical Statistics.

2. Its object shall be to promote the interests of mathematical statistics. 

ARTICLE II 

Membership

1. The membership of the Institute shall consist of Members, Junior Members, Fellows, 
Honorary Members, and Sustaining Members. 

2. Voting members of the Institute shall be (a) the Fellows, and (b) all others, Junior 
Members excepted, who have been members for twenty-three months prior to the date of 
voting. 

3. No person shall be a Junior Member of the Institute for more than a limited term as
determined by the Committee on Membership and approved by the Board of Directors. 

ARTICLE III 

Officers, Board of Directors, and Committee on Membership

1. The Officers of the Institute shall be a President, two Vice-Presidents, and a Secretary-
Treasurer. The terms of office of the President and Vice-Presidents shall be one year
and that of the Secretary-Treasurer three years. Elections shall be by majority ballots at 
Annual Meetings of the Institute. Voting may be in person or by mail. 

(a) Exception. The first group of Officers shall be elected by a majority vote of the in¬ 
dividuals present at the organization meeting, and shall serve until December 31, 1938. 

2. The Board of Directors of the Institute shall consist of the Officers, the two previous 
Presidents, and the Editor of the Official Journal of the Institute. 

3. The Institute shall have a Committee on Membership composed of a Chairman and
three Fellows. At their first meeting subsequent to the Adoption of this Constitution, the
Board of Directors shall elect three members as Fellows to serve as the Committee on
Membership, one member of the Committee for a term of one year, another for a term of 
two years, and another for a term of three years. Thereafter the Board of Directors shall 
elect from among the Fellows one member annually at their first meeting after their elec¬ 
tion for a term of three years. The president shall designate one of the Vice-Presidents as 
Chairman of this Committee. 

ARTICLE IV 
Meetings

1. A meeting for the presentation and discussion of papers, for the election of Officers,
and for the transaction of other business of the Institute shall be held annually at such
time as the Board of Directors may designate. Additional meetings may be called from
time to time by the Board of Directors and shall be called at any time by the President
upon written request from ten Fellows. Notice of the time and place of meeting shall be
given to the membership by the Secretary-Treasurer at least thirty days prior to the date
set for the meeting. All meetings except executive sessions shall be open to the public.
Only papers accepted by a Program Committee appointed by the President may be
presented to the Institute.

2. The Board of Directors shall hold a meeting immediately after their election and
again immediately before the expiration of their term. Other meetings of the Board may 
be held from time to time at the call of the President or any two members of the Board. 
Notice of each meeting of the Board, other than the two regular meetings, together with a 
statement of the business to be brought before the meeting, must be given to the members 
of the Board by the Secretary-Treasurer at least five days prior to the date set therefor. 
Should other business be passed upon, any member of the Board shall have the right to 
reopen the question at the next meeting. 

3. Meetings of the Committee on Membership may be held from time to time at the call
of the Chairman or any member of the Committee provided notice of such call and the
purpose of the meeting is given to the members of the Committee by the Secretary-
Treasurer at least five days before the date set therefor. Should other business be passed
upon, any member of the Committee shall have the right to reopen the question at the
next meeting. Committee business may also be transacted by correspondence if that
seems preferable.

4. At a regularly convened meeting of the Board of Directors, four members shall con¬ 
stitute a quorum. At a regularly convened meeting of the Committee on Membership, 
two members shall constitute a quorum. 

ARTICLE V 

Publications

1. The Annals of Mathematical Statistics shall be the Official Journal for the Institute.
The Editor of the Annals of Mathematical Statistics shall be a Fellow appointed by the
Board of Directors of the Institute. The term of office of the Editor may be terminated at
the discretion of the Board of Directors.

2. Other publications may be originated by the Board of Directors as occasion arises.

ARTICLE VI 
Expulsion or Suspension

1. Except for non-payment of dues, no one shall be expelled or suspended except by
action of the Board of Directors with not more than one negative vote.

ARTICLE VII 
Amendments 

1. This constitution may be amended by an affirmative two-thirds vote at any regularly 
convened meeting of the Institute provided notice of such proposed amendment shall have
been sent to each voting member by the Secretary-Treasurer at least thirty days before the
date of the meeting at which the proposal is to be acted upon. Voting may be in person or
by mail.





BY-LAWS 
ARTICLE I 

Duties of the Officers, the Editor, Board of Directors, and Committee on Membership

1. The President, or in his absence, one of the Vice-Presidents, or in the absence of the
President and both Vice-Presidents, a Fellow selected by vote of the Fellows present, shall
preside at the meetings of the Institute and of the Board of Directors. At meetings of the
Institute, the presiding officer shall vote only in the case of a tie, but at meetings of the
Board of Directors he may vote in all cases. At least three months before the date of the
annual meeting, the President shall appoint a Nominating Committee of three members.
It shall be the duty of the Nominating Committee to make nominations for Officers to be
elected at the annual meeting and the Secretary-Treasurer shall notify all voting members
at least thirty days before the annual meeting. Additional nominations may be submitted
in writing, if signed by at least ten Fellows of the Institute, up to the time of the
meeting.

2. The Secretary-Treasurer shall keep a full and accurate record of the proceedings at
the meetings of the Institute and of the Board of Directors, send out calls for said meetings
and, with the approval of the President and the Board, carry on the correspondence of the
Institute. Subject to the direction of the Board, he shall have charge of the archives and
other tangible and intangible property of the Institute, and once a year he shall publish in
the Annals of Mathematical Statistics a classified list of all Members and Fellows of the
Institute. He shall send out calls for annual dues and acknowledge receipt of same; pay
all bills approved by the President for expenditures authorised by the Board or the
Institute; keep a detailed account of all receipts and expenditures, prepare a financial statement
at the end of each year and present an abstract of the same at the annual meeting of the
Institute after it has been audited by a Member or Fellow of the Institute appointed by the
President as Auditor. The Auditor shall report to the President.

3. Subject to the direction of the Board, the Editor shall be charged with the
responsibility for all editorial matters concerning the editing of the Annals of Mathematical
Statistics. He shall, with the advice and consent of the Board, appoint an Editorial
Committee of not less than twelve members to co-operate with him; four for a period of five years,
four for a period of three years, and the remaining members for a period of two years,
appointments to be made annually as needed. All appointments to the Editorial
Committee shall terminate with the appointment of a new Editor. The Editor shall serve as
editorial adviser in the publication of all scientific monographs and pamphlets authorised
by the Board.

4. The Board of Directors shall have charge of the funds and of the affairs of the In¬ 
stitute, with the exception of those affairs specifically assigned to the President or to the 
Committee on Membership. The Board shall have authority to fill all vacancies ad
interim, occurring among the Officers, Board of Directors, or in any of the Committees. The
Board may appoint such other committees as may be required from time to time to carry 
on the affairs of the Institute. The power of election to the different grades of Member¬ 
ship, except the grades of Member and Junior Member, shall reside in the Board. 

5. The Committee on Membership shall prepare and make available through the
Secretary-Treasurer an announcement indicating the qualifications requisite for the different
grades of membership. The Committee shall review these qualifications periodically
and shall make such changes in these qualifications and make such recommendations with
reference to the number of grades of membership as it deems advisable. The power to
elect worthy applicants to the grades of Member and Junior Member shall reside in the
Committee, which may delegate this power to the Secretary-Treasurer, subject to such
reservations as the Committee considers appropriate. The Committee shall make
recommendations to the Board of Directors with reference to placing members in other grades
of membership. The Committee shall give its attention to the question of increasing the
number of applicants for membership and shall advise the Secretary-Treasurer on plans
for that purpose.


ARTICLE II 
Dues 

1. Members shall pay five dollars at the time of admission to membership and shall receive 
the full current volume of the Official Journal. Thereafter, Members shall pay five dol¬ 
lars annual dues. The annual dues of Junior Members shall be two dollars and fifty cents. 

The annual dues of Fellows shall be five dollars. The annual dues of Sustaining Members 
shall be fifty dollars. Honorary Members shall be exempt from all dues. 

(a) Exception. In the case that two Members of the Institute are husband and wife
and they elect to receive between them only one copy of the Official Journal, the annual 
dues of each shall be three dollars and seventy-five cents. 

(b) Exception. Any Member or Fellow may make a single payment which will be
accepted by the Institute in place of all succeeding yearly dues and which will not otherwise
alter his status as a Member or Fellow. The amount of this payment will depend upon
the age of this Member or Fellow and will be based upon a suitable table and rate of
interest, to be specified by the Board of Directors.

(c) Exception. Any Member or Junior Member of the Institute serving, except as a
commissioned officer, in the Armed Forces of the United States or of one of its allies, may
upon notification to the Secretary-Treasurer be excused from the payment of dues until the
January first following his discharge from the Service. He shall have all privileges of
membership except that he shall not receive the Official Journal. However during the first
year of his resumed regular membership he may have the right to purchase, at $2.50 per
volume, one copy of each volume of the Official Journal published during the period of his 
service membership. 

2. Annual dues shall be payable on the first day of January of each year. 

3. The annual dues of a Fellow, Member, or Junior Member include a subscription to the 
Official Journal. The annual dues of a Sustaining Member include two subscriptions to 
the Official Journal. 

4. It shall be the duty of the Secretary-Treasurer to notify by mail anyone whose dues 
may be six months in arrears, and to accompany such notice by a copy of this Article. If 
such person fail to pay such dues within three months from the date of mailing such notice, 
the Secretary-Treasurer shall report the delinquent one to the Board of Directors, by whom 
the person’s name may be stricken from the rolls and all privileges of membership
withdrawn. Such person may, however, be re-instated by the Board of Directors upon
payment of the arrears of dues.





ARTICLE III 
Salaries 

1. The Institute shall not pay a salary to any Officer, Director, or member of any
committee.

ARTICLE IV 
Amendments 

1. These By-Laws may be amended in the same manner as the Constitution or by a 
majority vote at any regularly convened meeting of the Institute, if the proposed amend¬ 
ment has been previously approved by the Board of Directors. 



SEQUENTIAL TESTS OF STATISTICAL HYPOTHESES 
By A. Wald

Columbia University 

Table of Contents

A. Introduction

B. Historical note

Part I. Sequential test of a simple hypothesis against a single alternative

1. The current test procedure

2. The sequential test procedure, general definitions
   2.1. Notion of a sequential test. 2.2. Efficiency of a sequential test. 2.3.
   Efficiency of the current procedure, viewed as a particular case of a sequential
   test.

3. Sequential probability ratio test
   3.1. Definition of the sequential probability ratio test. 3.2. Fundamental
   relations among the quantities α, β, A and B. 3.3. Determination of the values
   A and B in practice. 3.4. Probability of accepting H₀ (or H₁) when some third
   hypothesis H is true. 3.5. Calculation of δ and η for binomial and normal
   distributions.

4. The number of observations required by the sequential probability ratio test
   4.1. Expected number of observations necessary for reaching a decision. 4.2.
   Calculation of the quantities ξ and [illegible] for binomial and normal distributions.
   4.3. Saving in the number of observations as compared with the current test
   procedure. 4.4. The characteristic function, the moments and the distribution of
   the number of observations necessary for reaching a decision. 4.5. Lower limit
   of the probability that the sequential process will terminate with a number of
   trials less than or equal to a given number. 4.6. Truncated sequential analysis.
   4.7. Efficiency of the sequential probability ratio test.

Part II. Sequential test of a simple or composite hypothesis against a set of
alternatives

5. Test of a simple hypothesis against one-sided alternatives
   5.1. General remarks. 5.2. Application to binomial distributions. 5.3.
   Sequential analysis of double dichotomies. 5.4. Application to testing the mean
   of a normal distribution with known standard deviation.

6. Outline of a general theory of sequential tests of hypotheses when no
   restrictions are imposed on the alternative values of the unknown parameters
   6.1. Sequential test of a simple hypothesis with no restrictions on the
   alternative values of the unknown parameters. 6.2. Sequential test of a composite
   hypothesis.



A. Introduction 

By a sequential test of a statistical hypothesis is meant any statistical test
procedure which gives a specific rule, at any stage of the experiment (at the
n-th trial for each integral value of n), for making one of the following three
decisions: (1) to accept the hypothesis being tested (null hypothesis), (2) to
reject the null hypothesis, (3) to continue the experiment by making an additional
observation. Thus, such a test procedure is carried out sequentially.
On the basis of the first trial, one of the three decisions mentioned above is made.
If the first or the second decision is made, the process is terminated. If the
third decision is made, a second trial is performed. Again on the basis of the
first two trials one of the three decisions is made and if the third decision is
reached a third trial is performed, etc. This process is continued until either
the first or the second decision is made.

An essential feature of the sequential test, as distinguished from the current
test procedure, is that the number of observations required by the sequential
test is not predetermined, but is a random variable due to the fact that at any
stage of the experiment the decision of terminating the process depends on the
results of the observations previously made. The current test procedure may
be considered a limiting case of a sequential test in the following sense: For any
positive integer n less than some fixed positive integer N, the third decision is
always taken at the n-th trial irrespective of the results of these first n trials.
At the N-th trial either the first or the second decision is taken. Which decision
is taken will depend, of course, on the results of the N trials.
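
The three-decision scheme described above, and the sense in which the current procedure is a limiting case of it, can be sketched in a few lines of Python. This is only an illustration: the names `run_sequential_test` and `fixed_sample_rule`, the string labels for the decisions, and the toy final test (accept when the sample mean is below 0.5) are assumptions made for the example and do not come from the paper.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Labels for the three decisions; the strings themselves are illustrative.
ACCEPT, REJECT, CONTINUE = "accept", "reject", "continue"

Rule = Callable[[Sequence[float]], str]


def run_sequential_test(observations: Iterable[float], rule: Rule) -> Tuple[str, int]:
    """Apply `rule` after every new observation until it accepts or rejects."""
    seen = []
    for x in observations:
        seen.append(x)
        decision = rule(seen)
        if decision != CONTINUE:
            return decision, len(seen)
    # The supplied data ran out before a terminal decision was reached.
    return CONTINUE, len(seen)


def fixed_sample_rule(N: int, final_test: Callable[[Sequence[float]], bool]) -> Rule:
    """Current procedure as a limiting case: continue while n < N, decide at n = N."""
    def rule(seen: Sequence[float]) -> str:
        if len(seen) < N:
            return CONTINUE  # third decision, irrespective of the data
        return ACCEPT if final_test(seen) else REJECT
    return rule


# Hypothetical final test: accept the null hypothesis when the mean is below 0.5.
rule = fixed_sample_rule(4, lambda xs: sum(xs) / len(xs) < 0.5)
print(run_sequential_test([0.1, 0.9, 0.2, 0.3, 0.8], rule))  # ('accept', 4)
```

With a genuinely sequential rule the second element of the returned pair, the number of observations used, varies with the data; with `fixed_sample_rule` it is always N.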

In a sequential test, as well as in the current test procedure, we may commit
two kinds of errors. We may reject the null hypothesis when it is true (error
of the first kind), or we may accept the null hypothesis when some alternative
hypothesis is true (error of the second kind). Suppose that we wish to test the
null hypothesis H₀ against a single alternative hypothesis H₁, and that we want
the test procedure to be such that the probability of making an error of the
first kind (rejecting H₀ when H₀ is true) does not exceed a preassigned value α,
and the probability of making an error of the second kind (accepting H₀ when
H₁ is true) does not exceed a preassigned value β. Using the current test
procedure, i.e., a most powerful test for testing H₀ against H₁ in the sense of the
Neyman-Pearson theory, the minimum number of observations required by the
test can be determined as follows: For any given number N of observations a
most powerful test is considered for which the probability of an error of the first
kind is equal to α. Let β(N) denote the probability of an error of the second
kind for this test procedure. Then the minimum number of observations is
equal to the smallest positive integer N for which β(N) ≤ β.
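
As a worked illustration of this recipe, the sketch below finds the smallest N for one concrete case chosen for the example, not taken from this passage: a one-sided test of the mean of a normal distribution with known standard deviation. Here the most powerful test of size α rejects when the sample mean exceeds μ₀ + z·σ/√N, with z the upper α point of the standard normal distribution, so β(N) has a closed form. The function names and the numerical values are assumptions.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def beta_of_N(N: int, mu0: float, mu1: float, sigma: float, alpha: float) -> float:
    """Second-kind error of the size-alpha most powerful test with N observations.

    Assumes a normal mean with known sigma and mu1 > mu0 (one-sided case).
    """
    z = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf(z - (mu1 - mu0) * sqrt(N) / sigma)


def minimum_N(mu0: float, mu1: float, sigma: float,
              alpha: float, beta: float, max_N: int = 1_000_000) -> int:
    """Smallest positive integer N with beta_of_N(N) <= beta."""
    for N in range(1, max_N + 1):
        if beta_of_N(N, mu0, mu1, sigma, alpha) <= beta:
            return N
    raise ValueError("no N up to max_N meets the requirement")


# Illustrative values: mu0 = 0, mu1 = 0.5, sigma = 1, alpha = beta = 0.05.
print(minimum_N(0.0, 0.5, 1.0, 0.05, 0.05))  # 44
```

The direct search mirrors the wording of the text; in this normal case the same answer follows from rounding ((z_α + z_β)σ/(μ₁ − μ₀))² up to the next integer.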

In this paper a particular test procedure, called the sequential probability
ratio test, is devised and shown to have certain optimum properties (see section
4.7). The sequential probability ratio test in general requires an expected number
of observations considerably smaller than the fixed number of observations
needed by the current most powerful test which controls the errors of the first
and second kinds to exactly the same extent (has the same α and β) as the
sequential test. The sequential probability ratio test frequently results in a
saving of about 50% in the number of observations as compared with the current
most powerful test. Another surprising feature of the sequential probability
ratio test is that the test can be carried out without determining any
probability distributions whatsoever. In the current procedure the test can be
carried out only if the probability distribution of the statistic on which the test
is based is known. This is not necessary in the application of the sequential
probability ratio test, and only simple algebraic operations are needed for carrying
it out. Distribution problems arise in connection with the sequential probability
ratio test only if we want to make statements about the probability
distribution of the number of observations required by the test.
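
To make the remark about "simple algebraic operations" concrete, here is a minimal sketch of a sequential probability ratio test for a binomial proportion. The formal definition belongs to section 3 of the paper and is not reproduced in this introduction, so the details below are assumptions for illustration: the test continues while the likelihood ratio stays strictly between B and A, and the boundaries are taken as A = (1 − β)/α and B = β/(1 − α), the approximate values usually associated with the test. The function name and the example values are likewise hypothetical.

```python
from math import log
from typing import Iterable, Tuple


def sprt_bernoulli(observations: Iterable[int], p0: float, p1: float,
                   alpha: float, beta: float) -> Tuple[str, int]:
    """Sequential probability ratio test of H0: p = p0 against H1: p = p1.

    Works with the logarithm of the likelihood ratio, so each trial only
    adds one of two constants to a running sum.
    """
    log_A = log((1 - beta) / alpha)   # reject H0 once the sum reaches this
    log_B = log(beta / (1 - alpha))   # accept H0 once the sum falls to this
    step_success = log(p1 / p0)
    step_failure = log((1 - p1) / (1 - p0))

    total, n = 0.0, 0
    for x in observations:
        n += 1
        total += step_success if x else step_failure
        if total >= log_A:
            return "reject H0", n
        if total <= log_B:
            return "accept H0", n
    return "continue", n  # data exhausted before either boundary was reached


# Illustrative values: p0 = 0.5, p1 = 0.8, alpha = beta = 0.05.
print(sprt_bernoulli([1] * 10, 0.5, 0.8, 0.05, 0.05))  # ('reject H0', 7)
print(sprt_bernoulli([0] * 10, 0.5, 0.8, 0.05, 0.05))  # ('accept H0', 4)
```

No sampling distribution is computed anywhere in the loop, which is the point made in the paragraph above; distributions matter only when one asks how many trials the loop is likely to take.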

This paper consists of two parts. Part I deals with the theory of sequential
tests for testing a simple hypothesis against a single alternative. In Part II a
theory of sequential tests for testing simple or composite hypotheses against
infinite sets of alternatives is outlined. The extension of the probability ratio
test to the case of testing a simple hypothesis against a set of one-sided
alternatives is straightforward and does not present any difficulty. Applications to
testing the means of binomial and normal distributions, as well as to testing
double dichotomies are given. The theory of sequential tests of hypotheses
with no restrictions on the possible values of the unknown parameters is,
however, not as simple. There are several unsolved problems in this case and it is
hoped that the general ideas outlined in Part II will stimulate further research.

Sections 5.2, 5.3 and 5.4 in Part II deal with the applications of the sequential
probability ratio test to binomial distributions, double dichotomies and normal
distributions. These sections are nearly self-contained and can be understood
without reading the rest of the paper. Thus, readers who are primarily
interested in these special cases of the sequential probability ratio test rather than
in the general theory, may profitably read only the above mentioned sections.
For the benefit of readers who lack a sufficient background in the mathematical
theory of statistics the exposition in sections 5.2, 5.3 and 5.4 is kept on a fairly
elementary level.

It should be pointed out that whenever the number of observations on which
the test is based is for some reason determined in advance, for instance, if certain
data are available from past history and no additional data can be obtained, then
the current most powerful test procedure is preferable. The superiority of the
sequential probability ratio test is due to the fact that it requires a smaller
expected number of observations than the current most powerful test. This
feature of the sequential probability ratio test is, however, of no value if the
number of observations is for some reason determined in advance.

B. Historical Note

To the best of the author's knowledge the first idea of a sequential test, i.e.,
a test where the number of observations is not predetermined but is dependent
on the outcome of the observations, goes back to H. F. Dodge and H. G. Romig
who proposed a double sampling inspection procedure [1]. In this double
sampling scheme the decision whether a second sample should be drawn or not
depends on the outcome of the observations in the first sample. The reason for
introducing a double sampling method was, of course, the recognition of the fact
that double sampling results in a reduction of the amount of inspection as
compared with "single" sampling.

The double sampling method does not fully take advantage of sequential
analysis, since it does not allow for more than two samples. A multiple sampling
scheme for the particular case of testing the mean of a binomial distribution was
proposed and discussed by Walter Bartky [2]. His procedure is closely related
to the test which results from the application of the sequential probability ratio
test to testing the mean of a binomial distribution. Bartky clearly recognized
the fact that multiple sampling results in a considerable reduction of the average
amount of inspection.

The idea of chain experiments discussed briefly by Harold Hotelling [3] is also
somewhat related to our notion of sequential analysis. An interesting example
of such a chain of experiments is the series of sample censuses of area of jute in
Bengal carried out under the direction of P. C. Mahalanobis [6]. The successive
preliminary censuses, steadily increasing in size, were primarily designed to
obtain some information as to the parameters to be estimated so that an efficient
design could be set up for the final sampling of the whole immense jute area in
the province.

In March 1943, the problem of sequential analysis arose in the Statistical
Research Group, Columbia University,¹ in connection with a specific question
posed by Captain G. L. Schuyler of the Bureau of Ordnance, Navy Department.
It was pointed out by Milton Friedman and W. Allen Wallis that the mere notion
of sequential analysis could slightly improve the efficiency of some current most
powerful tests. This can be seen as follows: Suppose that $N$ is the planned
number of trials and $W_N$ is a most powerful critical region based on $N$
observations. If it happens that on the basis of the first $n$ trials ($n < N$) it is already
certain that the completed set of $N$ trials must lead to a rejection of the null
hypothesis, we can terminate the experiment at the $n$-th trial and thus save some
observations. For instance, if $W_N$ is defined by the inequality
$x_1^2 + \cdots + x_N^2 \ge c$, and if for some $n < N$ we find that
$x_1^2 + \cdots + x_n^2 \ge c$, we can terminate the
process at this stage. Realization of this naturally led Friedman and Wallis to
the conjecture that modifications of current tests may exist which take advantage
of sequential procedure and effect substantial improvements. More specifically,
Friedman and Wallis conjectured that a sequential test may exist that controls
the errors of the first and second kinds to exactly the same extent as the current
most powerful test, and at the same time requires an expected number of
observations substantially smaller than the number of observations required by the
current most powerful test.²

¹ The Statistical Research Group operates under a contract with the Office of Scientific
Research and Development and is directed by the Applied Mathematics Panel of the
National Defense Research Committee.

It was at this stage that the problem was called to the attention of the author
of the present paper. Since infinitely many sequential test procedures exist,
the first and basic problem was, of course, to find the particular sequential test
procedure which is most efficient, i.e., which effects the greatest possible saving
in the expected number of observations as compared with any other (sequential
or non-sequential) test. In April, 1943 the author devised such a test, called
the sequential probability ratio test, which for all practical purposes is most
efficient when used for testing a simple hypothesis $H_0$ against a single
alternative $H_1$.

Because of the substantial savings in the expected number of observations
effected by the sequential probability ratio test, and because of the simplicity
of this test procedure in practical applications, the National Defense Research
Committee considered these developments sufficiently useful for the war effort
to make it desirable to keep the results out of the reach of the enemy, at least for
a certain period of time. The author was, therefore, requested to submit his
findings in a restricted report [7] which was dated September, 1943.³ In this
report the sequential probability ratio test is devised and its mathematical theory
is developed. In July 1944 a second report [8] was issued by the Statistical
Research Group which gives an elementary non-mathematical exposition of
the applications of the sequential probability ratio test, together with charts,
tables and computational simplifications to facilitate applications.

Independently of the developments here, G. A. Barnard [9] recognized the
merits of a sequential method of testing, i.e., the possibility of a saving in the
number of observations as compared with the current most powerful test. He
also devised an interesting sequential test for testing double dichotomies, which
differs from the one obtained by applying the sequential probability ratio test.

Some further developments in the theory of the sequential probability ratio
test took place in 1944. Extending the methods used in [7], C. M. Stockman
[10] found the operating characteristic curve of the sequential probability ratio
test applied to a binomial distribution. Independently of Stockman, Milton
Friedman and George W. Brown (independently of each other) obtained the
same result which can be extended to the normal distribution and a few other
specific distributions, but is not applicable to more general distributions. The
general operating characteristic curve for any sequential probability ratio test
was derived by the author [11]. A few months later the author developed a
general theory of cumulative sums [4] which gives not only the operating
characteristic curve for any sequential probability ratio test but also the
characteristic function of the number of observations required by the test.

² Bartky's multiple sampling scheme [2] for testing the mean of a binomial distribution
provides, of course, an example of such a sequential test (see, for example, the remarks on
p. 377 in [2]). Bartky's results were not known to us at that time, since they were published
nearly a year later.

³ The material was recently released, making the present publication possible.

The theory of the sequential probability ratio test as given in the present
paper differs considerably from the exposition given in [7], since the new
developments in [4] have been taken into account. However, some tables and a
few sections of the original report [7] are included in the present paper without
any substantial changes.

Part I. Sequential Test of a Simple Hypothesis Against a
Single Alternative

1. The Current Test Procedure 

Let $X$ be a random variable. In what follows in this and the subsequent
sections it will be assumed that the random variable $X$ has either a continuous
probability density function or a discrete distribution. Accordingly, by the
probability distribution $f(x)$ of a random variable $X$ we shall mean either the
probability density function of $X$ or the probability that $X = x$, depending upon
whether $X$ is a continuous or a discrete variable. Let the hypothesis $H_0$ to be
tested (null hypothesis) be the statement that the distribution of $X$ is $f_0(x)$.
Suppose that $H_0$ is to be tested against the single alternative hypothesis $H_1$ that
the distribution of $X$ is given by $f_1(x)$.

According to the Neyman-Pearson theory of testing hypotheses a most powerful
critical region $W_N$ for testing $H_0$ against $H_1$ on the basis of $N$ independent
observations $x_1, \cdots, x_N$ on $X$ is given by the set of all sample points
$(x_1, \cdots, x_N)$ for which the inequality

(1.1)   $\frac{f_1(x_1) f_1(x_2) \cdots f_1(x_N)}{f_0(x_1) f_0(x_2) \cdots f_0(x_N)} \ge k$

is fulfilled. The quantity $k$ on the right hand side of (1.1) is a constant and is
chosen so that the size of the critical region, i.e., the probability of an error of
the first kind, should have the required value $\alpha$.

For a fixed sample size $N$ the probability $\beta$ of an error of the second kind is a
single valued function of $\alpha$, say $\beta_N(\alpha)$, if a most powerful critical region is used.
Thus, if in addition to fixing the value of $\alpha$ it is required that the probability of
an error of the second kind should have a preassigned value $\beta$, or at least it should
not exceed a preassigned value $\beta$, we are no longer free to choose the sample size
$N$. The minimum number of observations required by the test satisfying these
conditions is equal to the smallest integral value of $N$ for which $\beta_N(\alpha) \le \beta$.

Thus, the current most powerful test procedure for testing $H_0$ against $H_1$ can
be briefly stated as follows: We choose as critical region the region defined by
(1.1) where the constant $k$ is determined so that the probability of an error of
the first kind should have a preassigned value $\alpha$ and $N$ is equal to the smallest
integer for which the probability of an error of the second kind does not exceed
a preassigned value $\beta$.
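
As an illustration of how the fixed sample size $N$ is determined, the following minimal sketch treats one special case only: testing a normal mean $\theta_0$ against $\theta_1 > \theta_0$ with known standard deviation. In that case the most powerful region (1.1) reduces to rejecting when the sample mean is large. The function names and the numerical values are assumptions chosen for illustration; they do not come from the paper.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def beta_N(N, theta0, theta1, sigma, alpha):
    """Second-kind error of the most powerful size-alpha test based on N
    observations (normal mean, known sigma, theta1 > theta0)."""
    shift = (theta1 - theta0) * sqrt(N) / sigma
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1 - alpha) - shift)


def fixed_sample_size(theta0, theta1, sigma, alpha, beta):
    """Smallest N for which beta_N(alpha) <= beta."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    return ceil(((z_alpha + z_beta) * sigma / (theta1 - theta0)) ** 2)


# Hypothetical example: theta0 = 0, theta1 = 0.5, sigma = 1, alpha = beta = 0.05.
N = fixed_sample_size(0.0, 0.5, 1.0, alpha=0.05, beta=0.05)
print(N, beta_N(N, 0.0, 0.5, 1.0, 0.05), beta_N(N - 1, 0.0, 0.5, 1.0, 0.05))
```

With these illustrative values the second-kind error first drops to 0.05 or below at the returned $N$, and still exceeds 0.05 with one observation fewer.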





2. The Sequential Test Procedure: General Definitions 

2.1. Notion of a sequential test. In current tests of hypotheses the number of
observations is treated as a constant for any particular problem. In sequential
tests the number of observations is no longer a constant, but a random variable.
In what follows the symbol $n$ is used for the number of observations required by
a sequential test and the symbol $N$ is used when the number of observations is
treated as a constant.

Sequential tests can be described as follows: For each positive integer $m$ the
$m$-dimensional sample space $M_m$ is subdivided into three mutually exclusive
parts $R_m^0$, $R_m^1$ and $R_m$. After the first observation $x_1$ has been drawn $H_0$ is
accepted if $x_1$ lies in $R_1^0$, $H_0$ is rejected (i.e., $H_1$ is accepted) if $x_1$ lies in $R_1^1$, or a
second observation is drawn if $x_1$ lies in $R_1$. If the third decision is reached and
a second observation $x_2$ drawn, $H_0$ is accepted, $H_1$ is accepted, or a third
observation is drawn according as the point $(x_1, x_2)$ lies in $R_2^0$, $R_2^1$ or in $R_2$. If $(x_1, x_2)$
lies in $R_2$, a third observation $x_3$ is drawn and one of the three decisions is made
according as $(x_1, x_2, x_3)$ lies in $R_3^0$, $R_3^1$ or in $R_3$, etc. This process is stopped
when, and only when, either the first decision or the second decision is reached.
Let $n$ be the number of observations at which the process is terminated. Then
$n$ is a random variable, since the value of $n$ depends on the outcome of the
observations. (It will be seen later that the probability is one that the sequential
process will be terminated at some finite stage.)
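
The three-way subdivision described above can be sketched as a short loop. This is only an illustration: the names `run_sequential_test`, `toy_region` and the toy stopping rule are hypothetical and are not taken from the paper. The `region` argument plays the role of the sets $R_m^0$, $R_m^1$ and $R_m$ by classifying the sample observed so far.

```python
from typing import Callable, Iterable, List, Tuple

ACCEPT_H0, ACCEPT_H1, CONTINUE = "accept H0", "accept H1", "continue"


def run_sequential_test(observations: Iterable[float],
                        region: Callable[[List[float]], str]) -> Tuple[str, int]:
    """Draw observations one at a time until the sample falls into the
    acceptance region of H0 or of H1; return the decision and n."""
    sample: List[float] = []
    for x in observations:
        sample.append(x)
        decision = region(sample)      # which of the three parts holds the sample?
        if decision != CONTINUE:
            return decision, len(sample)
    # The supplied data ran out before a terminal decision was reached.
    return CONTINUE, len(sample)


def toy_region(sample: List[float]) -> str:
    """Hypothetical rule: stop once the running sum leaves (-3, 3)."""
    s = sum(sample)
    if s >= 3:
        return ACCEPT_H1
    if s <= -3:
        return ACCEPT_H0
    return CONTINUE


decision, n = run_sequential_test([1, -1, 1, 1, 1, 1], toy_region)
print(decision, n)
```

Here $n$ is the length of the sample at the moment a terminal decision is first reached, so it depends on the observed data.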

We shall denote by $E_0(n)$ the expected value of $n$ if $H_0$ is true and by $E_1(n)$
the expected value of $n$ if $H_1$ is true. These expected values, of course, depend
on the sequential test used. In order to put this dependence in evidence, we
shall occasionally use the symbols $E_0(n \mid S)$ and $E_1(n \mid S)$ to denote the values
$E_0(n)$ and $E_1(n)$, respectively, when the sequential test $S$ is applied.

2.2. Efficiency of a sequential test. As in the current test procedure, errors of
two kinds may be committed in sequential analysis. We may reject $H_0$ when
it is true (error of the first kind), or we may accept $H_0$ when $H_1$ is true (error of
the second kind). With any sequential test there will be associated two
numbers $\alpha$ and $\beta$ between 0 and 1 such that if $H_0$ is true the probability is $\alpha$ that we
shall commit an error of the first kind and if $H_1$ is true, the probability is $\beta$ that
we shall commit an error of the second kind. We shall say that two sequential
tests $S$ and $S'$ are of equal strength if the values $\alpha$ and $\beta$ associated with $S$ are
equal to the corresponding values $\alpha'$ and $\beta'$ associated with $S'$. If $\alpha < \alpha'$ and
$\beta \le \beta'$, or if $\alpha \le \alpha'$ and $\beta < \beta'$, we shall say that $S$ is stronger than $S'$ ($S'$ is
weaker than $S$). If $\alpha > \alpha'$ and $\beta < \beta'$, or if $\alpha < \alpha'$ and $\beta > \beta'$, we shall say
that the strength of $S$ is not comparable with that of $S'$.

Restricting ourselves to sequential tests of a given strength, we want to make
the number of observations necessary for reaching a final decision as small as
possible. If $S$ and $S'$ are two sequential tests of equal strength we shall say
that $S'$ is better than $S$ if either $E_0(n \mid S') < E_0(n \mid S)$ and
$E_1(n \mid S') \le E_1(n \mid S)$, or $E_0(n \mid S') \le E_0(n \mid S)$ and
$E_1(n \mid S') < E_1(n \mid S)$. A sequential test
will be said to be an admissible test if no better test of equal strength exists.





If a sequential test $S$ satisfies both inequalities $E_0(n \mid S) \le E_0(n \mid S')$ and
$E_1(n \mid S) \le E_1(n \mid S')$ for any sequential test $S'$ of strength equal to that of $S$, then
the test $S$ can be considered to be a best sequential test. That such tests exist,
i.e., that it is possible to minimize $E_0(n)$ and $E_1(n)$ simultaneously, is not proved
here; but it is shown later (section 4.7) that for the so called sequential
probability ratio test defined in section 3.1 both $E_0(n)$ and $E_1(n)$ are very nearly
minimized.⁴ Thus, for all practical purposes the sequential probability ratio
test can be considered best.

Since it is unknown whether a sequential test always exists for which both $E_0(n)$
and $E_1(n)$ are exactly minimized, we need a substitute definition of an optimum
test. Several substitute definitions are possible. We could, for example,
require that the test be admissible and the maximum of the two values $E_0(n)$ and
$E_1(n)$ be minimized, or that the mean $\frac{E_0(n) + E_1(n)}{2}$, or some other weighted
average be minimized. All these definitions are equivalent if a sequential test
exists for which both $E_0(n)$ and $E_1(n)$ are minimized; but if they cannot be
minimized simultaneously the definitions differ. Which of them is chosen is of no
significance for the purpose of this paper, since for the sequential probability
ratio test proposed later both expected values $E_0(n)$ and $E_1(n)$ are, if not exactly,
very nearly minimized. If we had a priori knowledge as to how frequently $H_0$
and how frequently $H_1$ will be true in the long run, it would be most reasonable
to minimize a weighted average (weighted by the frequencies of $H_0$ and $H_1$,
respectively) of $E_0(n)$ and $E_1(n)$. However, when such knowledge is absent,
as is usually the case in practical applications, it is perhaps more reasonable to
minimize the maximum of $E_0(n)$ and $E_1(n)$ than to minimize some weighted
average of $E_0(n)$ and $E_1(n)$. Hence the following definition is introduced.

A sequential test $S$ is said to be an optimum test if $S$ is admissible and
$\operatorname{Max}[E_0(n \mid S), E_1(n \mid S)] \le \operatorname{Max}[E_0(n \mid S'), E_1(n \mid S')]$ for all sequential tests $S'$ of
strength equal to that of $S$.

By the efficiency of a sequential test $S$ is meant the value of the ratio⁵

$\frac{\operatorname{Max}[E_0(n \mid S^*),\, E_1(n \mid S^*)]}{\operatorname{Max}[E_0(n \mid S),\, E_1(n \mid S)]}$,

where $S^*$ is an optimum sequential test of strength equal to that of $S$.
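
The definitions of strength and efficiency given in this section can be written out as two small functions. This is a sketch for illustration only: the function names are hypothetical, and the expected numbers of observations passed to `efficiency` are assumed to be known, which in practice they usually are not.

```python
def compare_strength(alpha, beta, alpha_p, beta_p):
    """Compare a test S with errors (alpha, beta) to a test S' with
    errors (alpha_p, beta_p), following the definitions of section 2.2."""
    if alpha == alpha_p and beta == beta_p:
        return "equal strength"
    if alpha <= alpha_p and beta <= beta_p:
        return "S is stronger"
    if alpha >= alpha_p and beta >= beta_p:
        return "S is weaker"
    return "not comparable"


def efficiency(e0, e1, e0_opt, e1_opt):
    """Ratio Max[E0(n|S*), E1(n|S*)] / Max[E0(n|S), E1(n|S)], where the
    starred values belong to an optimum test of the same strength."""
    return max(e0_opt, e1_opt) / max(e0, e1)


print(compare_strength(0.01, 0.10, 0.05, 0.10))   # S is stronger
print(efficiency(100, 100, 40, 50))               # 0.5
```

A fixed-sample test with $N = 100$ measured against a hypothetical optimum test needing at most 50 observations on the average would thus have efficiency 0.5.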

2.3. Efficiency of the current procedure, viewed as a particular case of a sequential
test. The current test procedure can be considered as a particular case of a
sequential test. In fact, let $N$ be the size of the sample used in the current
procedure and let $W_N$ be the critical region on which the test is based. Then the
current procedure can be considered as a sequential test defined as follows: For
all $m < N$, the regions $R_m^0$, $R_m^1$ are the empty subsets of the $m$-dimensional sample
space $M_m$, and $R_m = M_m$. For $m = N$, $R_N^1$ is equal to $W_N$, $R_N^0$ is equal to the
complement $\overline{W}_N$ of $W_N$ and $R_N$ is the empty set. Thus, for the current
procedure we have $E_0(n) = E_1(n) = N$.

⁴ The author conjectures that $E_0(n)$ and $E_1(n)$ are exactly minimized for the sequential
probability ratio test, but he did not succeed in proving this, except for a special class of
problems (see section 4.7).

⁵ The existence of an optimum sequential test is not essential for the definition of
efficiency, since $\operatorname{Max}[E_0(n \mid S^*), E_1(n \mid S^*)]$ could be replaced by the greatest lower bound of
$\operatorname{Max}[E_0(n \mid S'), E_1(n \mid S')]$ with respect to all sequential tests $S'$ of strength equal to that
of $S$.

It will be seen later that the efficiency of the current test based on the most
powerful critical region is rather low. Frequently it is below 1/2. In other words,
an optimum sequential test can attain the same $\alpha$ and $\beta$ as the current most
powerful test on the basis of an expected number of observations much smaller
than the fixed number of observations needed for the current most powerful test.

In the next section we shall propose a simple sequential test procedure, called
the sequential probability ratio test, which for all practical purposes can be
considered an optimum sequential test. It will be seen that these sequential tests
usually lead to average savings of about 50% in the number of trials as compared
with the current most powerful test.


3. Sequential Probability Ratio Test 


3.1. Definition of the sequential probability ratio test. We have seen in section
2.1 that the sequential test procedure is defined by subdividing the $m$-dimensional
sample space $M_m$ ($m = 1, 2, \cdots$, ad inf.) into three mutually exclusive parts
$R_m^0$, $R_m^1$ and $R_m$. The sequential process is terminated at the smallest value $n$
of $m$ for which the sample point lies either in $R_n^0$ or in $R_n^1$. If the sample point
lies in $R_n^0$ we accept $H_0$ and if it lies in $R_n^1$ we accept $H_1$.

An indication as to the proper choice of the regions $R_m^0$, $R_m^1$ and $R_m$ can be
obtained from the following considerations: Suppose that before the sample is
drawn there exists an a priori probability that $H_0$ is true and the value of this
probability is known. Denote this a priori probability by $g_0$. Then the a priori
probability that $H_1$ is true is given by $g_1 = 1 - g_0$, since it is assumed that the
hypotheses $H_0$ and $H_1$ exhaust all possibilities. After a number of observations
have been made we gain additional information which will affect the probability
that $H_i$ ($i = 0, 1$) is true. Let $g_{0m}$ be the a posteriori probability that $H_0$ is true
and $g_{1m}$ the a posteriori probability that $H_1$ is true after $m$ observations have been
made. Then according to the well known formula of Bayes we have

(3.1)   $g_{0m} = \frac{g_0\, p_{0m}(x_1, \cdots, x_m)}{g_0\, p_{0m}(x_1, \cdots, x_m) + g_1\, p_{1m}(x_1, \cdots, x_m)}$

and

(3.2)   $g_{1m} = \frac{g_1\, p_{1m}(x_1, \cdots, x_m)}{g_0\, p_{0m}(x_1, \cdots, x_m) + g_1\, p_{1m}(x_1, \cdots, x_m)}$

where $p_{im}(x_1, \cdots, x_m)$ denotes the probability density in the $m$-dimensional
sample space calculated under the hypothesis $H_i$ ($i = 0, 1$).⁶ As an
abbreviation for $p_{im}(x_1, \cdots, x_m)$ we shall use simply $p_{im}$.

⁶ If the probability distribution is discrete $p_{im}(x_1, \cdots, x_m)$ denotes the probability that
the sample point $(x_1, \cdots, x_m)$ will be obtained.
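
Formulas (3.1) and (3.2) can be evaluated directly for independent observations. The sketch below does so for a hypothetical binomial example; the function name `posterior_probabilities`, the two distributions and the prior value 0.5 are illustrative assumptions, not values from the paper.

```python
from math import prod


def posterior_probabilities(sample, f0, f1, g0):
    """A posteriori probabilities (g_0m, g_1m) of H0 and H1 after the
    independent observations in `sample`, given the a priori value g0."""
    g1 = 1.0 - g0
    p0m = prod(f0(x) for x in sample)   # p_0m(x_1, ..., x_m)
    p1m = prod(f1(x) for x in sample)   # p_1m(x_1, ..., x_m)
    denom = g0 * p0m + g1 * p1m
    return g0 * p0m / denom, g1 * p1m / denom


# Hypothetical distributions: success probability 0.5 under H0, 0.8 under H1.
def f0(x):
    return 0.5


def f1(x):
    return 0.8 if x == 1 else 0.2


g0m, g1m = posterior_probabilities([1, 1, 0], f0, f1, g0=0.5)
print(g0m, g1m)
```

The two returned values always sum to one, which is the fact used below to show that the acceptance regions cannot overlap.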





Let $d_0$ and $d_1$ be two positive numbers less than 1 and greater than 1/2.⁷ Suppose
that we want to construct a sequential test such that the conditional probability
of a correct decision under the condition that $H_0$ is accepted is greater than or
equal to $d_0$, and the conditional probability of a correct decision under the
condition that $H_1$ is accepted is greater than or equal to $d_1$. Then the following
sequential process seems reasonable: At each stage calculate $g_{0m}$ and $g_{1m}$. If
$g_{1m} \ge d_1$, accept $H_1$. If $g_{0m} \ge d_0$, accept $H_0$. If $g_{1m} < d_1$ and $g_{0m} < d_0$, draw
an additional observation. $R_m^0$ in this sequential process is thus defined by the
inequality $g_{0m} \ge d_0$, $R_m^1$ by the inequality $g_{1m} \ge d_1$, and $R_m$ by the simultaneous
inequalities $g_{1m} < d_1$ and $g_{0m} < d_0$. It is necessary that the sets $R_m^0$, $R_m^1$ and
$R_m$ be mutually exclusive and exhaustive. For this it suffices that the
inequalities

(3.3)   $g_{1m} = \frac{g_1 p_{1m}}{g_0 p_{0m} + g_1 p_{1m}} \ge d_1$

and

(3.4)   $g_{0m} = \frac{g_0 p_{0m}}{g_0 p_{0m} + g_1 p_{1m}} \ge d_0$

be not fulfilled simultaneously. To show that (3.3) and (3.4) are incompatible,
we shall assume that they are simultaneously fulfilled and derive a contradiction
from this assumption. The two inequalities sum to

(3.5)   $g_{1m} + g_{0m} \ge d_1 + d_0$.

Since $g_{0m} + g_{1m} = 1$, we have

$1 \ge d_1 + d_0$

which is impossible, since by assumption $d_i > 1/2$ ($i = 0, 1$). Hence it is proved
that the sets $R_m^0$, $R_m^1$ and $R_m$ are mutually exclusive and exhaustive.

The inequalities (3.3) and (3.4) are equivalent to the following inequalities,
respectively:

(3.6)   $\frac{p_{1m}}{p_{0m}} \ge \frac{g_0}{g_1} \cdot \frac{d_1}{1 - d_1}$

and

(3.7)   $\frac{p_{1m}}{p_{0m}} \le \frac{g_0}{g_1} \cdot \frac{1 - d_0}{d_0}$.

The constants on the right hand sides of (3.6) and (3.7) do not depend on $m$.

⁷ The restrictions $d_0 > 1/2$ and $d_1 > 1/2$ are imposed because otherwise it might happen
that the hypothesis with the smaller a posteriori probability will be accepted.

If an a priori probability of $H_0$ does not exist, or if it is unknown, the
inequalities (3.6) and (3.7) suggest the use of the following sequential test: At each stage
calculate $p_{1m}/p_{0m}$. If $p_{1m} = p_{0m} = 0$, the value of the ratio $p_{1m}/p_{0m}$ is defined
to be equal to 1. Accept $H_1$ if

(3.8)   $\frac{p_{1m}}{p_{0m}} \ge A$.

Accept $H_0$ if

(3.9)   $\frac{p_{1m}}{p_{0m}} \le B$.

Take an additional observation if

(3.10)   $B < \frac{p_{1m}}{p_{0m}} < A$.

Thus, the number $n$ of observations required by the test is the smallest integral
value of $m$ for which either (3.8) or (3.9) holds. The constants $A$ and $B$ are
chosen so that $0 < B < A$ and the sequential test has the desired value $\alpha$ of the
probability of an error of the first kind and the desired value $\beta$ of the probability
of an error of the second kind. We shall call the test procedure defined by (3.8),
(3.9) and (3.10), a sequential probability ratio test.
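
The rules (3.8), (3.9) and (3.10) translate almost line by line into code for independent observations. The following is a minimal sketch under stated assumptions: the function name `sprt`, the two binomial distributions and the constants $A = 19$, $B = 1/19$ are illustrative choices, not taken from the paper.

```python
def sprt(observations, f0, f1, A, B):
    """Sequential probability ratio test for independent observations.
    Returns (decision, number of observations used)."""
    if not 0 < B < A:
        raise ValueError("the constants must satisfy 0 < B < A")
    p0m = p1m = 1.0
    m = 0
    for m, x in enumerate(observations, start=1):
        p0m *= f0(x)
        p1m *= f1(x)
        if p0m == 0.0 and p1m == 0.0:
            ratio = 1.0                  # convention stated in the text
        elif p0m == 0.0:
            ratio = float("inf")
        else:
            ratio = p1m / p0m
        if ratio >= A:                   # (3.8)
            return "accept H1", m
        if ratio <= B:                   # (3.9)
            return "accept H0", m
        # otherwise (3.10) holds and another observation is taken
    return "no decision", m              # supplied data exhausted


# Hypothetical example: success probability 0.5 under H0, 0.75 under H1.
def f0(x):
    return 0.5


def f1(x):
    return 0.75 if x == 1 else 0.25


print(sprt([1, 1, 0, 1, 1, 1, 1, 1, 1, 1], f0, f1, A=19, B=1 / 19))
print(sprt([0, 0, 0, 0, 0], f0, f1, A=19, B=1 / 19))
```

For long samples the running products can underflow; accumulating the logarithm of the ratio and comparing it with $\log A$ and $\log B$ avoids this and is equivalent whenever both densities are positive.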

The sequential test procedure given by (3.8), (3.9) and (3.10) has been
justified here merely on an intuitive basis. Section 4.7, however, shows that for this
sequential test the expected values $E_0(n)$ and $E_1(n)$ are very nearly minimized.⁸
Thus, for practical purposes this test can be considered an optimum test.

3.2. Fundamental relations among the quantities $\alpha$, $\beta$, $A$ and $B$. In this section
the quantities $\alpha$, $\beta$, $A$ and $B$ will be related by certain inequalities which are of
basic importance for the sequential analysis.

Let $\{x_m\}$ ($m = 1, 2, \cdots$, ad inf.) be an infinite sequence of observations. The
set of all possible infinite sequences $\{x_m\}$ is called the infinite dimensional sample
space. It will be denoted by $M_\infty$. Any particular infinite sequence $\{x_m\}$ is
called a point of $M_\infty$. For any set of $n$ given real numbers $a_1, \cdots, a_n$ we shall
denote by $C(a_1, \cdots, a_n)$ the subset of $M_\infty$ which consists of all points (infinite
sequences) $\{x_m\}$ ($m = 1, 2, \cdots$, ad inf.) for which $x_1 = a_1, \cdots, x_n = a_n$. For
any values $a_1, \cdots, a_n$ the set $C(a_1, \cdots, a_n)$ will be called a cylindric point of
order $n$. A subset $S$ of $M_\infty$ will be called a cylindric point, if there exists a
positive integer $n$ for which $S$ is a cylindric point of order $n$. Thus, a cylindric point
may be a cylindric point of order 1, or of order 2, etc. A cylindric point
$C(a_1, \cdots, a_n)$ will be said to be of type 1 if

$\frac{p_{1n}}{p_{0n}} = \frac{f_1(a_1) f_1(a_2) \cdots f_1(a_n)}{f_0(a_1) f_0(a_2) \cdots f_0(a_n)} \ge A$

and

$B < \frac{p_{1m}}{p_{0m}} = \frac{f_1(a_1) \cdots f_1(a_m)}{f_0(a_1) \cdots f_0(a_m)} < A \qquad (m = 1, \cdots, n - 1)$.

A cylindric point $C(a_1, \cdots, a_n)$ will be said to be of type 0 if

$\frac{p_{1n}}{p_{0n}} = \frac{f_1(a_1) f_1(a_2) \cdots f_1(a_n)}{f_0(a_1) f_0(a_2) \cdots f_0(a_n)} \le B$

and

$B < \frac{p_{1m}}{p_{0m}} = \frac{f_1(a_1) \cdots f_1(a_m)}{f_0(a_1) \cdots f_0(a_m)} < A \qquad (m = 1, \cdots, n - 1)$.

Thus, if a sample $(x_1, \cdots, x_n)$ is observed for which $C(x_1, \cdots, x_n)$ is a cylindric
point of type $i$, the sequential test defined by (3.8), (3.9) and (3.10) leads to the
acceptance of $H_i$ ($i = 0, 1$).

⁸ It seems likely to the author that $E_0(n)$ and $E_1(n)$ are exactly minimized for the
sequential probability ratio test. However, he did not succeed in proving it, except for a
special class of problems (see section 4.7).

Let $Q_i$ be the sum of all cylindric points of type $i$ ($i = 0, 1$). For any subset
$M$ of $M_\infty$ we shall denote by $P_i(M)$ the probability of $M$ calculated under the
assumption that $H_i$ is true ($i = 0, 1$). Now we shall prove that

(3.11)   $P_i(Q_0 + Q_1) = 1 \qquad (i = 0, 1)$.

This equation means that the probability is equal to one that the sequential
process will eventually terminate. To prove (3.11) we shall denote the variate
$\log \frac{f_1(x_i)}{f_0(x_i)}$ by $z_i$ and $z_1 + \cdots + z_m$ by $Z_m$ ($i, m = 1, 2, \cdots$, ad inf.).
Furthermore, denote by $n$ the smallest integer for which either $Z_n \ge \log A$ or
$Z_n \le \log B$. If no such finite integer $n$ exists we shall say that $n = \infty$. Clearly, $n$ is
the number of observations required by the sequential test and (3.11) is proved
if we show that the probability that $n = \infty$ is zero. But the latter statement
was proved by the author elsewhere (see Lemma 1 in [4]). Hence equation
(3.11) is proved.

With the help of (3.11) we shall be able to derive some important inequalities
satisfied by the quantities $\alpha$, $\beta$, $A$ and $B$. Since for each sample $(x_1, \cdots, x_n)$
for which $C(x_1, \cdots, x_n)$ is an element of $Q_1$ the inequality $p_{1n}/p_{0n} \ge A$ holds,
we see that

(3.12)   $P_1(Q_1) \ge A\, P_0(Q_1)$.

Similarly, for each sample $(x_1, \cdots, x_n)$ for which $C(x_1, \cdots, x_n)$ is a point of
$Q_0$ the inequality $p_{1n}/p_{0n} \le B$ holds. Hence

(3.13)   $P_1(Q_0) \le B\, P_0(Q_0)$.

But $P_0(Q_1)$ is the probability of committing an error of the first kind and $P_1(Q_0)$
is the probability of making an error of the second kind. Thus, we have

(3.14)   $P_0(Q_1) = \alpha, \qquad P_1(Q_0) = \beta$.





Since $Q_0$ and $Q_1$ are disjoint, it follows from (3.11) that

(3.15)   $P_0(Q_0) = 1 - \alpha, \qquad P_1(Q_1) = 1 - \beta$.

From the relations (3.12)-(3.15) we obtain the important inequalities

(3.16)   $1 - \beta \ge A\alpha$

and

(3.17)   $\beta \le B(1 - \alpha)$.

These inequalities can be written as

(3.18)   $\frac{\alpha}{1 - \beta} \le \frac{1}{A}$

and

(3.19)   $\frac{\beta}{1 - \alpha} \le B$.

The above inequalities are of great value in practical applications, since they
supply upper limits for $\alpha$ and $\beta$ when $A$ and $B$ are given. For instance, it follows
immediately from (3.18) and (3.19), and the fact that $0 < \alpha < 1$, $0 < \beta < 1$, that

(3.20)   $\alpha \le \frac{1}{A}$

and

(3.21)   $\beta \le B$.
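
The passage from the pointwise inequality $p_{1n}/p_{0n} \ge A$ to (3.12) can be spelled out as follows. This is a sketch for the continuous case; the symbol $T_n$, denoting the set of samples $(x_1, \cdots, x_n)$ whose cylindric point is of type 1, is introduced here for illustration and does not appear in the original. Cylindric points of type 1 of different orders are disjoint, so the probabilities add over $n$.

```latex
\begin{aligned}
P_1(Q_1) &= \sum_{n=1}^{\infty} \int_{T_n} p_{1n}(x_1,\dots,x_n)\,dx_1\cdots dx_n \\
         &\ge A \sum_{n=1}^{\infty} \int_{T_n} p_{0n}(x_1,\dots,x_n)\,dx_1\cdots dx_n
          = A\,P_0(Q_1).
\end{aligned}
```

Inserting $P_1(Q_1) = 1 - \beta$ and $P_0(Q_1) = \alpha$ gives $1 - \beta \ge A\alpha$, which is (3.16); the same argument over the samples of type 0, with $p_{1n} \le B\,p_{0n}$, gives (3.17). In the discrete case the integrals are replaced by sums.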

A pair of values $\alpha$ and $\beta$ can be represented by a point in the plane with the
coordinates $\alpha$ and $\beta$. It is of interest to determine the set of all points $(\alpha, \beta)$
which satisfy the inequalities (3.18) and (3.19) for given values of $A$ and $B$.
Consider the straight lines $L_1$ and $L_2$ in the plane given by the equations

(3.22)   $A\alpha = 1 - \beta$

and

(3.23)   $\beta = B(1 - \alpha)$,

respectively. The line $L_1$ intersects the abscissa axis at $\alpha = \frac{1}{A}$ and the ordinate
axis at $\beta = 1$. The line $L_2$ intersects the abscissa axis at $\alpha = 1$ and the ordinate
axis at $\beta = B$. The set of all points $(\alpha, \beta)$ which satisfy the inequalities (3.18)
and (3.19) is the interior and the boundary of the quadrilateral determined by
the lines $L_1$, $L_2$ and the coordinate axes. This set is represented by the shaded
area in figure 1.
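
The quadrilateral just described can be explored numerically. The sketch below checks whether a point $(\alpha, \beta)$ satisfies (3.18) and (3.19) and computes the point where $L_1$ and $L_2$ meet, obtained by solving (3.22) and (3.23) simultaneously. The function names and the values $A = 19$, $B = 1/19$ are illustrative assumptions, and the sketch presumes $B < 1 < A$.

```python
def satisfies_fundamental_inequalities(alpha, beta, A, B):
    """True if (alpha, beta) satisfies (3.18) and (3.19), written in the
    equivalent forms A*alpha <= 1 - beta and beta <= B*(1 - alpha)."""
    return A * alpha <= 1 - beta and beta <= B * (1 - alpha)


def intersection_of_L1_and_L2(A, B):
    """Solve A*alpha = 1 - beta and beta = B*(1 - alpha) for (alpha, beta)."""
    alpha = (1 - B) / (A - B)
    beta = B * (A - 1) / (A - B)
    return alpha, beta


A, B = 19.0, 1.0 / 19.0
vertex = intersection_of_L1_and_L2(A, B)
print(vertex)
print(satisfies_fundamental_inequalities(0.03, 0.03, A, B))   # inside the region
print(satisfies_fundamental_inequalities(0.06, 0.01, A, B))   # outside: alpha > 1/A
```

With these values the intersection lies at approximately $\alpha = \beta = 0.05$; together with the origin, $(1/A, 0)$ and $(0, B)$ it forms the four corners of the region.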

The fundamental inequalities (3.18) and (3.19) were derived under the
assumption that $x_1, x_2, \cdots$, ad inf. are independent observations on the same random
variable $X$. The independence of the observations is, however, not necessary
for the validity of (3.18) and (3.19). In fact, the independence of the
observations was used merely to show the validity of (3.11). But (3.11) can be shown
to hold also for dependent observations under very general conditions. Hence,
if $H_i$ states that the joint distribution of $x_1, x_2, \cdots, x_m$ is given by the joint
probability density function $p_{im}(x_1, \cdots, x_m)$⁹ ($i = 0, 1$; $m = 1, 2, \cdots$, ad inf.)
and if (3.11) holds, then for the sequential test of $H_0$ against $H_1$, as defined by
(3.8), (3.9) and (3.10), the inequalities (3.18) and (3.19) remain valid. For
instance, let $\lambda_0$ and $\lambda_1$ be two different positive values $< 1$ and let $H_i$ ($i = 0, 1$)
be the hypothesis that the joint probability density function of $x_1, \cdots, x_m$ is
given by



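$p_{im}(x_1, \cdots, x_m) = \frac{1}{(2\pi)^{m/2}} \exp\Bigl\{-\tfrac{1}{2}\Bigl[x_1^2 + \sum_{j=2}^{m} (x_j - \lambda_i x_{j-1})^2\Bigr]\Bigr\} \qquad (i = 0, 1)$,

i.e., that $x_1$ and $(x_j - \lambda_i x_{j-1})$ ($j = 2, 3, \cdots$, ad inf.) are normally and
independently distributed with zero means and unit variances, then the inequalities
(3.18) and (3.19) will hold for the sequential test defined by (3.8), (3.9) and
(3.10).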

⁹ Of course, for any positive integers $m$ and $m'$ with $m < m'$ the marginal distribution of
$x_1, \cdots, x_m$ determined on the basis of the joint distribution $p_{im'}(x_1, \cdots, x_{m'})$ must be
equal to $p_{im}(x_1, \cdots, x_m)$.

3.3. Determination of the values $A$ and $B$ in practice. Suppose that we wish
to have a sequential test such that the probability of an error of the first kind is
equal to $\alpha$ and the probability of an error of the second kind is equal to $\beta$.
Denote by $A(\alpha, \beta)$ and $B(\alpha, \beta)$ the values of $A$ and $B$ for which the probabilities of
the errors of the first and second kinds will take the desired values $\alpha$ and $\beta$.
The exact determination of the values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ is rather laborious, as
will be seen in Section 3.4. The inequalities at our disposal, however, permit the
problem to be solved satisfactorily for practical purposes. From (3.18) and
(3.19) it follows that

(3.24)   $A(\alpha, \beta) \le \frac{1 - \beta}{\alpha}$

and

(3.25)   $B(\alpha, \beta) \ge \frac{\beta}{1 - \alpha}$.

Suppose we put $A = \frac{1 - \beta}{\alpha} = a(\alpha, \beta)$ (say), and $B = \frac{\beta}{1 - \alpha} = b(\alpha, \beta)$ (say).
Then $A$ is greater than or equal to the exact value $A(\alpha, \beta)$, and $B$ is less than or
equal to the exact value $B(\alpha, \beta)$. This procedure, of course, changes the
probabilities of errors of the first and second kind. If we were to use the exact value
of $B$ and a value of $A$ which is greater than the exact value, then evidently we
would lower the value of $\alpha$, but slightly increase the value of $\beta$. Similarly, if
we were to use the exact value of $A$ and a value of $B$ which is below the exact
value, then we would lower the value of $\beta$, but slightly increase the value of $\alpha$.
Thus, it is not clear what will be the resulting effect on $\alpha$ and $\beta$ if a value of $A$ is
used which is higher than the exact value, and a value of $B$ is used which is lower
than the exact value. Denote by $\alpha'$ and $\beta'$ the resulting probabilities of errors
of the first and second kind, respectively, if we put $A = \frac{1 - \beta}{\alpha}$ and $B = \frac{\beta}{1 - \alpha}$.

We now derive inequalities satisfied by the quantities a', /9', « and p. Sub¬ 
stituting a{a, p) for A, b{a, for B, a' for a and /3' for /3 we obtain from (3.18) 
and (3.19) 


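(3.26)    $\dfrac{\alpha'}{1 - \beta'} \le \dfrac{1}{a(\alpha, \beta)} = \dfrac{\alpha}{1 - \beta}$

and

(3.27)    $\dfrac{\beta'}{1 - \alpha'} \le b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$.

From these inequalities it follows that

(3.28)    $\alpha' \le \dfrac{\alpha}{1 - \beta}$

and

(3.29)    $\beta' \le \dfrac{\beta}{1 - \alpha}$.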





Multiplying (3.26) by $(1 - \beta)(1 - \beta')$ and (3.27) by $(1 - \alpha)(1 - \alpha')$ and adding the two resulting inequalities, we have

(3.30)    $\alpha' + \beta' \le \alpha + \beta$.

Thus, we see that at least one of the inequalities $\alpha' \le \alpha$ and $\beta' \le \beta$ must hold. In other words, by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$, respectively, at most one of the probabilities $\alpha$ and $\beta$ may be increased.
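
The step from (3.26) and (3.27) to (3.30) can be written out in full. The display below is only a worked restatement of that algebra, using no symbols beyond those already defined.

```latex
\begin{align*}
\alpha'(1-\beta) &\le \alpha(1-\beta')
   && \text{(3.26) multiplied by } (1-\beta)(1-\beta') \\
\beta'(1-\alpha) &\le \beta(1-\alpha')
   && \text{(3.27) multiplied by } (1-\alpha)(1-\alpha') \\
\alpha' - \alpha'\beta + \beta' - \alpha\beta'
   &\le \alpha - \alpha\beta' + \beta - \alpha'\beta
   && \text{(sum of the two lines)} \\
\alpha' + \beta' &\le \alpha + \beta
   && \text{(the cross terms cancel)}
\end{align*}
```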

If $\alpha$ and $\beta$ are small (say less than .05), as they frequently will be in practical applications, $\dfrac{\alpha}{1 - \beta}$ and $\dfrac{\beta}{1 - \alpha}$ are nearly equal to $\alpha$ and $\beta$, respectively. Thus, we see from (3.28) and (3.29) that the quantity by which $\alpha'$ can possibly exceed $\alpha$, or $\beta'$ can exceed $\beta$, must be small. Section 3.4 contains further inequalities which show that the amount by which $\alpha'$ ($\beta'$) can possibly exceed $\alpha$ ($\beta$) is indeed extremely small. Thus, for all practical purposes $\alpha' \le \alpha$ and $\beta' \le \beta$.

If $f_1(x)$ (the distribution under the alternative hypothesis) is sufficiently near $f_0(x)$ (the distribution under the null hypothesis), $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be nearly equal to $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively; and consequently $\alpha'$ and $\beta'$ are also very nearly equal to $\alpha$ and $\beta$ respectively. The reason that (3.18) and (3.19), and therefore also (3.24) and (3.25), are inequalities instead of equalities is that the sequential process may terminate with $\dfrac{p_{1n}}{p_{0n}} > A$ or $\dfrac{p_{1n}}{p_{0n}} < B$. If at the final stage $\dfrac{p_{1n}}{p_{0n}}$ were exactly equal to $A$ or $B$, then $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively. If $f_1(x)$ is near $f_0(x)$, it is almost certain that the value of $\dfrac{p_{1m}}{p_{0m}}$ is changed only slightly by one additional observation. Thus, at the final stage $\dfrac{p_{1n}}{p_{0n}}$ will be only slightly above $A$, or slightly below $B$, and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be nearly equal to $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively. If fractional observations were possible, that is to say, if the number of observations were a continuous variable, $\dfrac{p_{1m}}{p_{0m}}$ would also be a continuous function of $m$ and consequently $A(\alpha, \beta)$ and $B(\alpha, \beta)$ would be exactly equal to $\dfrac{1 - \beta}{\alpha}$ and $\dfrac{\beta}{1 - \alpha}$, respectively. Thus, we have inequalities in (3.24) and (3.25) instead of equalities merely on account of the fact that the number $m$ of observations is discontinuous, i.e., $m$ can take only integral values.
Hence for all practical purposes the following procedure can be adopted: To construct a sequential test such that the probability of an error of the first kind does not exceed $\alpha$ and the probability of an error of the second kind does not exceed $\beta$, put $A = \dfrac{1 - \beta}{\alpha}$ and $B = \dfrac{\beta}{1 - \alpha}$ and carry out the sequential test as defined by the inequalities (3.8), (3.9) and (3.10).
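
The procedure just stated can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the observations are assumed to be independent 0/1 trials, and the stopping rule is assumed to be the usual reading of (3.8), (3.9) and (3.10), namely to continue while $B < p_{1m}/p_{0m} < A$, accept $H_1$ when the ratio is $\ge A$, and accept $H_0$ when it is $\le B$.

```python
import math

def wald_thresholds(alpha, beta):
    """Return (A, B) = ((1 - beta) / alpha, beta / (1 - alpha))."""
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0 and alpha + beta < 1.0):
        raise ValueError("need alpha, beta in (0, 1) with alpha + beta < 1")
    return (1.0 - beta) / alpha, beta / (1.0 - alpha)

def sprt_bernoulli(observations, p0, p1, alpha, beta):
    """Run the test on 0/1 data; return (decision, observations used).

    Assumed stopping rule: stop as soon as log(p_1m / p_0m) leaves
    the open interval (log B, log A).
    """
    A, B = wald_thresholds(alpha, beta)
    log_a, log_b = math.log(A), math.log(B)
    log_ratio = 0.0          # running value of log(p_1m / p_0m)
    m = 0
    for m, x in enumerate(observations, start=1):
        if x == 1:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1.0 - p1) / (1.0 - p0))
        if log_ratio >= log_a:
            return "accept H1", m
        if log_ratio <= log_b:
            return "accept H0", m
    return "undecided", m    # data ran out before a boundary was reached
```

Working on the logarithm of the ratio avoids overflow for long runs; no distribution theory is needed to set the two boundaries, only $\alpha$ and $\beta$.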

In most practical cases the calculation of the exact values $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will be of little interest for the following reasons: When $A = a(\alpha, \beta) = \dfrac{1 - \beta}{\alpha}$ and $B = b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$, the probability $\alpha'$ of an error of the first kind cannot exceed $\alpha$ and the probability $\beta'$ of an error of the second kind cannot exceed $\beta$, except by a very small quantity which can be neglected for practical purposes. Thus, for all practical purposes the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ instead of $A(\alpha, \beta)$ and $B(\alpha, \beta)$ will not decrease the strength of the sequential test. The only possible disadvantage from the substitution is that it may increase the expected number of trials necessary for a decision. Since the discrepancy between $A(\alpha, \beta)$ and $B(\alpha, \beta)$ on the one hand and $a(\alpha, \beta)$ and $b(\alpha, \beta)$ on the other arises only from the discontinuity of the number $m$ of observations, it is clear that the increase in the expected number of trials caused by the use of $a(\alpha, \beta)$ and $b(\alpha, \beta)$ will be slight. This slight increase, however, cannot be considered entirely a loss for the following reason: if $a(\alpha, \beta) > A(\alpha, \beta)$ or $b(\alpha, \beta) < B(\alpha, \beta)$, then we can sharpen the inequality (3.30) to $\alpha' + \beta' < \alpha + \beta$. Hence by using $a(\alpha, \beta)$ and $b(\alpha, \beta)$ we gain in strength.

The fact that for practical purposes we may put $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ brings out a surprising feature of the sequential test as compared with current tests. While current tests cannot be carried out without finding the probability distribution of the statistic on which the test is based, there are no distribution problems in connection with sequential tests. In fact, $a(\alpha, \beta)$ and $b(\alpha, \beta)$ depend on $\alpha$ and $\beta$ only, and the ratio $\dfrac{p_{1m}}{p_{0m}}$ can be calculated from the data of the problem without solving any distribution problems. Distribution problems arise in connection with the sequential process only if it is desired to find the probability distribution of the number of trials necessary for reaching a final decision. (This subject is discussed later.) But this is of secondary importance as long as we know that the sequential test on the average leads to a saving in the number of trials.

3.4. Probability of accepting $H_0$ (or $H_1$) when some third hypothesis $H$ is true. In Section 3.2 we were concerned with the probability that the sequential probability ratio test will lead to the acceptance of $H_0$ (or $H_1$) when $H_0$ or $H_1$ is true. Since in Part II we shall admit an infinite set of alternatives, and since this is the practically important case, it is of interest to study the probability of accepting $H_0$ (or $H_1$) when any third hypothesis $H$, not necessarily equal to $H_0$ or $H_1$, is true. Let $H$ be the hypothesis that the distribution of $X$ is given by $f(x)$. If $f(x)$ is equal to $f_0(x)$ or $f_1(x)$ we have the special case discussed in Section 3.2. In what follows in this and the subsequent sections any probability relationship will be stated on the assumption that $H$ is true, unless a statement to the contrary is explicitly made. Denote by $\gamma$ the probability that the sequential probability ratio test will lead to the acceptance of $H_1$. Clearly, if $H = H_0$, then $\gamma = \alpha$ and if $H = H_1$, then $\gamma = 1 - \beta$.

The probability $\gamma$ can readily be derived on the basis of the general theory of cumulative sums given in [4]. Denote $\log \dfrac{f_1(x_i)}{f_0(x_i)}$ by $z_i$. Then $\{z_i\}$ $(i = 1, 2, \ldots,$ ad inf.$)$ is a sequence of independent random variables each having the same distribution. Denote by $Z_j$ the sum of the first $j$ elements of the sequence $\{z_i\}$, i.e.,

(3.31)    $Z_j = z_1 + \cdots + z_j$    $(j = 1, 2, \ldots,$ ad inf.$)$.

For any relation $R$ we shall denote by $P(R)$ the probability that $R$ holds. For any random variable $Y$ the symbol $EY$ will denote the expected value of $Y$. Let $n$ be the smallest positive integer for which either $Z_n \ge \log A$ or $Z_n \le \log B$ holds. If $\log B < Z_m < \log A$ holds for $m = 1, 2, \ldots,$ ad inf., we shall say that $n = \infty$. Obviously, $n$ is the number of observations required by the sequential probability ratio test. As we have seen in Section 3.3, in practice we shall put $A = a(\alpha, \beta) = \dfrac{1 - \beta}{\alpha}$ and $B = b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$. Since $B$ must be less than $A$, we shall consider only values $\alpha$ and $\beta$ for which $\dfrac{1 - \beta}{\alpha} > \dfrac{\beta}{1 - \alpha}$. This inequality is equivalent to $\alpha + \beta < 1$, which in turn implies that $B < 1$ and $A > 1$. Thus, in all that follows it will be assumed that $A > 1$ and $B < 1$. We shall also assume that the variance of $z_i$ is not zero.

According to Lemma 1 in [4] the relation $P(n = \infty) = 0$ holds. Hence, the probability is equal to one that the sequential process will eventually terminate. This implies that the probability of accepting $H_0$ is equal to $1 - \gamma$.

Let $z$ be a random variable whose distribution is equal to the common distribution of the variates $z_i$ $(i = 1, 2, \ldots,$ ad inf.$)$. Denote by $\varphi(t)$ the moment generating function of $z$, i.e.,

$\varphi(t) = E e^{zt}$.

It was shown in [4] that under very mild restrictions on the distribution of $z$ there exists exactly one real value $h$ such that $h \ne 0$ and $\varphi(h) = 1$. Furthermore, it was shown in [4] (see equation (16) in [4]) that

(3.32)    $E e^{Z_n h} = 1$.

Let $E^*$ be the conditional expected value of $e^{Z_n h}$ under the restriction that $H_0$ is accepted, i.e., that $Z_n \le \log B$, and let $E^{**}$ be the conditional expected value of $e^{Z_n h}$ under the restriction that $H_1$ is accepted, i.e., that $Z_n \ge \log A$. Then we obtain from (3.32)

(3.33)    $(1 - \gamma)E^* + \gamma E^{**} = 1$.

The probability that $H_0$ will be accepted is equal to $1 - \gamma$, as will be seen later.



Solving for $\gamma$ we obtain

(3.34)    $\gamma = \dfrac{1 - E^*}{E^{**} - E^*}$.

If both the absolute value of $Ez$ and the variance of $z$ are small, which will be the case when $f_1(x)$ is near $f_0(x)$, $E^*$ and $E^{**}$ will be nearly equal to $B^h$ and $A^h$, respectively. Hence, in this case a good approximation to $\gamma$ is given by the expression

(3.35)    $\bar\gamma = \dfrac{1 - B^h}{A^h - B^h}$.

It is easy to verify that $h = 1$ if $H = H_0$, and $h = -1$ if $H = H_1$. The difference $\bar\gamma - \gamma$ approaches zero if both the mean and the variance of $z$ converge to zero.
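
As a numerical illustration of (3.35), the following sketch finds the non-zero root $h$ of $\varphi(h) = 1$ by bisection and evaluates the approximation to $\gamma$. It assumes 0/1 observations with $P(X = 1) = p$ under $H$, so that $\varphi(t) = p\,(p_1/p_0)^t + (1 - p)\,((1 - p_1)/(1 - p_0))^t$; the function names are hypothetical, and bisection is chosen here only for simplicity.

```python
import math

def mgf_bernoulli(t, p, p0, p1):
    """phi(t) = E exp(t*z) with z = log(f1(x)/f0(x)), x ~ Bernoulli(p)."""
    return p * (p1 / p0) ** t + (1.0 - p) * ((1.0 - p1) / (1.0 - p0)) ** t

def nonzero_root_h(p, p0, p1):
    """Bisection for the root h != 0 of phi(h) = 1 (requires E z != 0)."""
    ez = p * math.log(p1 / p0) + (1.0 - p) * math.log((1.0 - p1) / (1.0 - p0))
    if ez == 0.0:
        raise ValueError("E z = 0, so phi has no non-zero real root")
    # phi is convex with phi(0) = 1 and phi'(0) = E z, so the other
    # root lies on the side of zero opposite to the sign of E z.
    sign = -1.0 if ez > 0.0 else 1.0
    hi = 1.0
    while mgf_bernoulli(sign * hi, p, p0, p1) <= 1.0:   # expand until phi > 1
        hi *= 2.0
    lo = hi
    while mgf_bernoulli(sign * lo, p, p0, p1) >= 1.0:   # shrink until phi < 1
        lo /= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mgf_bernoulli(sign * mid, p, p0, p1) < 1.0:
            lo = mid
        else:
            hi = mid
    return sign * 0.5 * (lo + hi)

def gamma_bar(h, alpha, beta):
    """Approximation (3.35) with A = (1 - beta)/alpha, B = beta/(1 - alpha)."""
    A = (1.0 - beta) / alpha
    B = beta / (1.0 - alpha)
    return (1.0 - B ** h) / (A ** h - B ** h)
```

With $p = p_0$ the root is $h = 1$ and the approximation returns $\alpha$; with $p = p_1$ the root is $h = -1$ and it returns $1 - \beta$, in agreement with the remark following (3.35).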

To judge the goodness of the approximation given by $\bar\gamma$, it is desirable to derive lower and upper limits for $\gamma$. Such limits for $\gamma$ can be obtained by deriving lower and upper limits for $E^*$ and $E^{**}$. First we consider the case when $h > 0$. Let $\zeta$ be a real variable restricted to values $> 1$, and let $\rho$ be a positive variable restricted to values $< 1$. For any random variable $Y$ and any relationship $R$ we shall denote by $E(Y \mid R)$ the conditional expected value of $Y$ under the restriction that $R$ holds. It was shown in [4] that the following inequalities hold (see relations (23) and (26) in [4]; the notation used here is somewhat different from that in [4]):

(3.36)    $B^h \Big[\underset{\zeta}{\mathrm{g.l.b.}}\ \zeta\, E\big(e^{hz} \mid e^{hz} \le \tfrac{1}{\zeta}\big)\Big] \le E^* \le B^h$    $(h > 0)$

and

(3.37)    $A^h \le E^{**} \le A^h \Big[\underset{\rho}{\mathrm{l.u.b.}}\ \rho\, E\big(e^{hz} \mid e^{hz} \ge \tfrac{1}{\rho}\big)\Big]$    $(h > 0)$.

The symbol $\underset{\zeta}{\mathrm{g.l.b.}}$ stands for the greatest lower bound with respect to $\zeta$, and the symbol $\underset{\rho}{\mathrm{l.u.b.}}$ stands for least upper bound with respect to $\rho$. Putting

(3.38)    $\eta = \underset{\zeta}{\mathrm{g.l.b.}}\ \zeta\, E\big(e^{hz} \mid e^{hz} \le \tfrac{1}{\zeta}\big)$

and

(3.39)    $\delta = \underset{\rho}{\mathrm{l.u.b.}}\ \rho\, E\big(e^{hz} \mid e^{hz} \ge \tfrac{1}{\rho}\big)$

the inequalities (3.36) and (3.37) can be written as

(3.40)    $B^h \eta \le E^* \le B^h$    $(h > 0)$



and

(3.41)    $A^h \le E^{**} \le A^h \delta$    $(h > 0)$.

Since $B < 1$ and $A > 1$, we see that $E^* < 1$ and $E^{**} > 1$ if $h > 0$. From this and the relations (3.34), (3.40) and (3.41) it follows easily that

(3.42)    $\dfrac{1 - B^h}{\delta A^h - B^h} \le \gamma \le \dfrac{1 - \eta B^h}{A^h - \eta B^h}$    $(h > 0)$.

If $h < 0$, limits for $\gamma$ can be obtained as follows: Let $z' = -z$, $A' = \dfrac{1}{B}$, $B' = \dfrac{1}{A}$. Then $h' = -h > 0$ and $\gamma' = 1 - \gamma$. Thus, according to (3.42) we have

(3.43)    $\dfrac{1 - (B')^{h'}}{\delta' (A')^{h'} - (B')^{h'}} \le \gamma' \le \dfrac{1 - \eta' (B')^{h'}}{(A')^{h'} - \eta' (B')^{h'}}$

where $\delta'$ and $\eta'$ are equal to the expressions we obtain from (3.39) and (3.38), respectively, by substituting $h'$ for $h$ and $z'$ for $z$. Since $\eta$ and $\delta$ depend only on the product $hz = h'z'$, we see that $\delta' = \delta$ and $\eta' = \eta$. Hence, we obtain from (3.43)

(3.44)    $\dfrac{1 - A^h}{\delta B^h - A^h} \le 1 - \gamma \le \dfrac{1 - \eta A^h}{B^h - \eta A^h}$    $(h < 0)$

where $\eta$ and $\delta$ are given by (3.38) and (3.39), respectively.

In Section 3.5 we shall calculate the value of $\eta$ and $\delta$ for binomial and normal distributions. If the limits of $\gamma$, as given in (3.42) and (3.44), are too far apart, it may be desirable to determine the exact value of $\gamma$, or at least to find a closer approximation to $\gamma$ than that given in (3.35). A solution of this problem is given in [4] (see section 7 of that paper). There the exact value of $\gamma$ is derived when $z$ can take only a finite number of integral multiples of a constant $d$. If $z$ does not have this property, arbitrarily fine approximation to the value of $\gamma$ can be obtained, since the distribution of $z$ can be approximated to any desired degree by a discrete distribution of the type mentioned before if the constant $d$ is chosen sufficiently small. The results obtained in [4] can be stated as follows: There is no loss of generality in assuming that $d = 1$, since the quantity $d$ can be chosen as the unit of measurement. Thus, we shall assume that $z$ takes only a finite number of integral values. Let $g_1$ and $g_2$ be two positive integers such that $P(z = -g_1)$ and $P(z = g_2)$ are positive and $z$ can take only integral values $\ge -g_1$ and $\le g_2$. Denote $P(z = i)$ by $h_i$. Then the moment generating function of $z$ is given by

$\varphi(t) = \sum_{i=-g_1}^{g_2} h_i e^{it}$.

Put $u = e^t$ and let $u_1, \ldots, u_g$ be the $g = g_1 + g_2$ roots of the equation of $g$-th degree

(3.45)    $\sum_{i=-g_1}^{g_2} h_i u^i = 1$.



Denote by $[a]$ the smallest integer $\ge \log A$, and by $[b]$ the largest integer $\le \log B$. Then $Z_n$ can take only the values

(3.46)    $[b] - g_1 + 1,\ [b] - g_1 + 2,\ \ldots,\ [b],\ [a],\ [a] + 1,\ \ldots,\ [a] + g_2 - 1$.

Denote the $g$ different integers in (3.46) by $c_1, \ldots, c_g$, respectively. Let $\Delta$ be the determinant value of the matrix $\| u_i^{c_j} \|$ $(i, j = 1, \ldots, g)$ and let $\Delta_j$ be the determinant we obtain from $\Delta$ by substituting 1 for the elements in the $j$-th column. Then, if $\Delta \ne 0$, the probability that $Z_n = c_j$ is given by

(3.47)    $P(Z_n = c_j) = \dfrac{\Delta_j}{\Delta}$.

Hence

(3.48)    $\gamma = P(Z_n \ge [a]) = \sum_j \dfrac{\Delta_j}{\Delta}$

where the summation is to be taken over all values of $j$ for which $c_j \ge [a]$.
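
The smallest non-trivial instance of (3.45) to (3.48) can be worked by hand, and the sketch below does so in Python. It assumes that $z$ takes only the values $-1$, $0$, $+1$ (so $g_1 = g_2 = 1$ and $g = 2$), in which case (3.45) multiplied by $u$ is a quadratic with roots $u_1 = 1$ and $u_2 = h_{-1}/h_1$, and (3.46) lists just the two values $[b]$ and $[a]$. The function name and argument names are hypothetical.

```python
import math

def exact_terminal_probs(h_minus, h_zero, h_plus, A, B):
    """P(Z_n = [b]) and P(Z_n = [a]) for steps z in {-1, 0, +1}.

    h_minus, h_zero, h_plus are P(z = -1), P(z = 0), P(z = +1).
    """
    # (3.45) times u:  h_plus*u^2 + (h_zero - 1)*u + h_minus = 0.
    # u = 1 is always a root; the product of the roots is h_minus/h_plus.
    u1, u2 = 1.0, h_minus / h_plus
    if u1 == u2:
        raise ValueError("Delta = 0 (E z = 0); formula (3.47) does not apply")
    a = math.ceil(math.log(A))    # [a]: smallest integer >= log A
    b = math.floor(math.log(B))   # [b]: largest integer <= log B
    # Matrix entries u_i^{c_j} with c_1 = [b], c_2 = [a].
    m11, m12 = u1 ** b, u1 ** a
    m21, m22 = u2 ** b, u2 ** a
    delta = m11 * m22 - m12 * m21
    delta_b = 1.0 * m22 - m12 * 1.0   # first column replaced by ones
    delta_a = m11 * 1.0 - 1.0 * m21   # second column replaced by ones
    return {b: delta_b / delta, a: delta_a / delta}
```

For $h_{-1} = 0.4$, $h_1 = 0.6$, $[a] = 3$ and $[b] = -2$ the value returned for $[a]$ is $135/211$, which is the classical gambler's-ruin probability $(1 - r^2)/(1 - r^5)$ with $r = 2/3$.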

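3.5. Calculation of $\delta$ and $\eta$ for binomial and normal distributions. Let $X$ be a random variable which can take only the values 0 and 1. Let the probability that $X = 1$ be $p_i$ if $H_i$ is true $(i = 0, 1)$, and $p$ if $H$ is true. Denote $1 - p$ by $q$ and $1 - p_i$ by $q_i$ $(i = 0, 1)$. Then $f_i(1) = p_i$, $f_i(0) = q_i$, $f(1) = p$ and $f(0) = q$.

It can be assumed without loss of generality that $p_1 > p_0$. The moment generating function of $z = \log \dfrac{f_1(x)}{f_0(x)}$ is given by

$\varphi(t) = p \Big(\dfrac{p_1}{p_0}\Big)^t + q \Big(\dfrac{q_1}{q_0}\Big)^t$.

Let $h \ne 0$ be the value of $t$ for which $\varphi(h) = 1$, i.e.,

$p \Big(\dfrac{p_1}{p_0}\Big)^h + q \Big(\dfrac{q_1}{q_0}\Big)^h = 1$.

First we consider the case when $h > 0$. It is clear that $e^{hz} = \Big(\dfrac{f_1(x)}{f_0(x)}\Big)^h > 1$ implies that $x = 1$. Hence $e^{hz} > 1$ implies that $e^{hz} = \Big(\dfrac{f_1(1)}{f_0(1)}\Big)^h = \Big(\dfrac{p_1}{p_0}\Big)^h$. From this and the definition of $\delta$ given in (3.39) it follows that

(3.49)    $\delta = \Big(\dfrac{p_1}{p_0}\Big)^h$    $(h > 0)$.

Similarly, the inequality $e^{hz} < 1$ implies that $e^{hz} = \Big(\dfrac{q_1}{q_0}\Big)^h$. From this and the definition of $\eta$ given in (3.38) it follows that

(3.50)    $\eta = \Big(\dfrac{q_1}{q_0}\Big)^h$    $(h > 0)$.

If $h < 0$, it can be shown in a similar way that

(3.51)    $\delta = \Big(\dfrac{q_1}{q_0}\Big)^h$    $(h < 0)$

and

(3.52)    $\eta = \Big(\dfrac{p_1}{p_0}\Big)^h$    $(h < 0)$.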
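
The binomial values of $\delta$ and $\eta$ can be combined with the limits for $\gamma$ in a short computation. The sketch below is an illustration under stated assumptions: it takes the limits in the form $\frac{1 - B^h}{\delta A^h - B^h} \le \gamma \le \frac{1 - \eta B^h}{A^h - \eta B^h}$ for $h > 0$ and the corresponding limits for $1 - \gamma$ with $A$ and $B$ interchanged for $h < 0$, takes $\delta$ and $\eta$ to be the values of $e^{hz}$ at the outcome that makes $e^{hz}$ greater than, respectively less than, 1, and expects the caller to supply $h$. The function names are hypothetical.

```python
def binomial_delta_eta(p0, p1, h):
    """(delta, eta) for 0/1 data with p1 > p0 and a given h != 0."""
    if h == 0:
        raise ValueError("h must be non-zero")
    up = (p1 / p0) ** h                      # e^{hz} when x = 1
    down = ((1.0 - p1) / (1.0 - p0)) ** h    # e^{hz} when x = 0
    # delta belongs to the outcome with e^{hz} > 1, eta to the one with e^{hz} < 1.
    return (up, down) if h > 0 else (down, up)

def gamma_limits(A, B, h, delta, eta):
    """Lower and upper limits for gamma, the probability of accepting H1."""
    if h > 0:
        lower = (1.0 - B ** h) / (delta * A ** h - B ** h)
        upper = (1.0 - eta * B ** h) / (A ** h - eta * B ** h)
    else:
        # For h < 0 the inequalities bound 1 - gamma.
        comp_lower = (1.0 - A ** h) / (delta * B ** h - A ** h)
        comp_upper = (1.0 - eta * A ** h) / (B ** h - eta * A ** h)
        lower, upper = 1.0 - comp_upper, 1.0 - comp_lower
    return lower, upper
```

For $p_0 = 0.5$, $p_1 = 0.6$ and $\alpha = \beta = 0.05$ (so $A = 19$, $B = 1/19$), the case $h = 1$ gives limits that enclose $\alpha = 0.05$, and the case $h = -1$ gives limits that enclose $1 - \beta = 0.95$.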

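Now we shall calculate the values of $\delta$ and $\eta$ if $X$ is normally distributed. Let

(3.53)    $f_i(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_i)^2}$    $(i = 0, 1)$

and

(3.54)    $f(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta)^2}$.

We can assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where $\Delta > 0$, since this can always be achieved by a translation. Then

(3.55)    $z = \log \dfrac{f_1(x)}{f_0(x)} = 2\Delta x$.

The moment generating function of $z$ is given by

(3.56)    $\varphi(t) = e^{2\Delta\theta t + 2\Delta^2 t^2}$.

Hence

(3.57)    $h = -\dfrac{\theta}{\Delta}$.

Substituting this value of $h$ in (3.38) and (3.39) we obtain

(3.58)    $\delta = \underset{\rho}{\mathrm{l.u.b.}}\ \rho\, E\big(e^{-2\theta x} \mid e^{-2\theta x} \ge \tfrac{1}{\rho}\big)$

and

(3.59)    $\eta = \underset{\zeta}{\mathrm{g.l.b.}}\ \zeta\, E\big(e^{-2\theta x} \mid e^{-2\theta x} \le \tfrac{1}{\zeta}\big)$.

For any relation $R$ let $P^*(R)$ denote the probability that the relation $R$ holds calculated under the assumption that the distribution of $x$ is normal with mean $\theta$ and variance unity. Furthermore, let $P^{**}(R)$ denote the probability that $R$ holds if the distribution of $x$ is normal with mean $-\theta$ and variance unity. Since $e^{-2\theta x}$ is equal to the ratio of the normal probability density function with mean $-\theta$ and variance unity to the normal probability density function with mean $\theta$ and variance unity, we see that

(3.60)    $\delta = \underset{\rho}{\mathrm{l.u.b.}}\ \rho\, \dfrac{P^{**}\big(e^{-2\theta x} \ge \tfrac{1}{\rho}\big)}{P^{*}\big(e^{-2\theta x} \ge \tfrac{1}{\rho}\big)}$

and

(3.61)    $\eta = \underset{\zeta}{\mathrm{g.l.b.}}\ \zeta\, \dfrac{P^{**}\big(e^{-2\theta x} \le \tfrac{1}{\zeta}\big)}{P^{*}\big(e^{-2\theta x} \le \tfrac{1}{\zeta}\big)}$.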



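It can easily be verified that the right hand side expressions in (3.60) and (3.61) have the same values for $\theta = \lambda$ as for $\theta = -\lambda$. Thus, also $\delta$ and $\eta$ have the same values for $\theta = \lambda$ as for $\theta = -\lambda$. It will be, therefore, sufficient to compute $\delta$ and $\eta$ for negative values of $\theta$. Let $\theta = -\lambda$ where $\lambda > 0$. First we show that $\eta = \dfrac{1}{\delta}$. Clearly

(3.62)    $\eta = \underset{\zeta}{\mathrm{g.l.b.}}\ \zeta\, \dfrac{P^{**}\big(e^{-2\lambda x} \ge \zeta\big)}{P^{*}\big(e^{-2\lambda x} \ge \zeta\big)}$    $(1 < \zeta < \infty)$.

Putting $\zeta = \dfrac{1}{\rho}$ $(0 < \rho < 1)$ in (3.62) gives

(3.63)    $\zeta\, \dfrac{P^{**}\big(e^{-2\lambda x} \ge \zeta\big)}{P^{*}\big(e^{-2\lambda x} \ge \zeta\big)} = \dfrac{P^{**}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}{\rho\, P^{*}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}$.

Hence

(3.64)    $\eta = \underset{\rho}{\mathrm{g.l.b.}}\ \dfrac{P^{**}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}{\rho\, P^{*}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)} = \dfrac{1}{\underset{\rho}{\mathrm{l.u.b.}}\ \Big[\rho\, \dfrac{P^{*}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}{P^{**}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}\Big]}$.

Because of the symmetry of the normal distribution, it is easily seen that

$\underset{\rho}{\mathrm{l.u.b.}}\ \Big[\rho\, \dfrac{P^{*}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}{P^{**}\big(e^{-2\lambda x} \ge \tfrac{1}{\rho}\big)}\Big] = \underset{\rho}{\mathrm{l.u.b.}}\ \Big[\rho\, \dfrac{P^{**}\big(e^{2\lambda x} \ge \tfrac{1}{\rho}\big)}{P^{*}\big(e^{2\lambda x} \ge \tfrac{1}{\rho}\big)}\Big] = \delta$.

Hence

(3.65)    $\eta = \dfrac{1}{\delta}$.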

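Now we shall calculate the value of $\delta$. Denote $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_x^{\infty} e^{-t^2/2}\, dt$ by $G(x)$. Then

$P^{**}\big(e^{2\lambda x} \ge \tfrac{1}{\rho}\big) = P^{**}\Big(x \ge \dfrac{1}{2\lambda} \log \dfrac{1}{\rho}\Big) = G\Big(\dfrac{1}{2\lambda} \log \dfrac{1}{\rho} - \lambda\Big)$.

Similarly

$P^{*}\big(e^{2\lambda x} \ge \tfrac{1}{\rho}\big) = P^{*}\Big(x \ge \dfrac{1}{2\lambda} \log \dfrac{1}{\rho}\Big) = G\Big(\dfrac{1}{2\lambda} \log \dfrac{1}{\rho} + \lambda\Big)$.

Denote $\dfrac{1}{2\lambda} \log \dfrac{1}{\rho}$ by $u$. Since $\rho$ can vary from 0 to 1, $u$ can take any value from 0 to $\infty$. Since $\rho = e^{-2\lambda u}$ we have

(3.66)    $\delta = \underset{\rho}{\mathrm{l.u.b.}}\ \rho\, \dfrac{P^{**}\big(e^{2\lambda x} \ge \tfrac{1}{\rho}\big)}{P^{*}\big(e^{2\lambda x} \ge \tfrac{1}{\rho}\big)} = \underset{u}{\mathrm{l.u.b.}}\ e^{-2\lambda u}\, \dfrac{G(u - \lambda)}{G(u + \lambda)}$    $(0 < u < \infty)$.

We shall prove that

(3.67)    $e^{-2\lambda u}\, \dfrac{G(u - \lambda)}{G(u + \lambda)} = \chi(u)$ (say)

is a monotonically decreasing function of $u$ and consequently the maximum is at $u = 0$. For this purpose it suffices to show that the derivative of $\log \chi(u)$ is never positive. Now

(3.68)    $\log \chi(u) = \log G(u - \lambda) - \log G(u + \lambda) - 2\lambda u$.

Denote $\dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ by $\Phi(x)$. Since $\dfrac{d}{dx} G(x) = -\Phi(x)$ it follows from (3.68) that

(3.69)    $\dfrac{d}{du} \log \chi(u) = -\dfrac{\Phi(u - \lambda)}{G(u - \lambda)} + \dfrac{\Phi(u + \lambda)}{G(u + \lambda)} - 2\lambda$.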


It follows from the mean value theorem that the right hand side of (3.69) is never positive if $\dfrac{d}{du}\Big(\dfrac{\Phi(u)}{G(u)}\Big) \le 1$ for all values of $u$. Thus, we need merely to show that

(3.70)    $\dfrac{d}{du}\Big(\dfrac{\Phi(u)}{G(u)}\Big) = \dfrac{\Phi'(u) G(u) - G'(u) \Phi(u)}{G^2(u)} = \dfrac{\Phi'(u) G(u) + \Phi^2(u)}{G^2(u)} = \dfrac{\Phi^2(u)}{G^2(u)} - u\, \dfrac{\Phi(u)}{G(u)} \le 1$.

Denote $\dfrac{\Phi(u)}{G(u)}$ by $y$. The roots of the equation $y^2 - uy - 1 = 0$ are

$y = \dfrac{u \pm \sqrt{u^2 + 4}}{2}$.

Hence the inequality $y^2 - uy - 1 \le 0$ holds if and only if

$\dfrac{u - \sqrt{u^2 + 4}}{2} \le y \le \dfrac{u + \sqrt{u^2 + 4}}{2}$.

Since $y$ cannot be negative, this inequality is equivalent to

(3.71)    $\dfrac{\Phi(u)}{G(u)} \le \dfrac{u + \sqrt{u^2 + 4}}{2}$.


Thus we have merely to prove (3.71). We shall show that (3.71) holds for all real values of $u$. Birnbaum has shown [6] that for $u > 0$

(3.72)    $\dfrac{\sqrt{u^2 + 4} - u}{2}\, \Phi(u) \le G(u)$.

Hence

(3.73)    $\dfrac{\Phi(u)}{G(u)} \le \dfrac{2}{\sqrt{u^2 + 4} - u} = \dfrac{\sqrt{u^2 + 4} + u}{2}$    $(u > 0)$

which proves (3.71) for $u > 0$. Now we prove (3.71) for $u < 0$. Let $u = -v$ where $v > 0$. Then it follows from (3.73) that

(3.74)    $\dfrac{\Phi(v)}{G(v)} \le \dfrac{2}{\sqrt{4 + v^2} - v}$.

Taking reciprocals, we obtain from (3.74)

(3.75)    $\dfrac{G(v)}{\Phi(v)} \ge \dfrac{\sqrt{4 + v^2} - v}{2}$.

Since

$\dfrac{G(u)}{\Phi(u)} \ge \dfrac{G(v) + 2v\,\Phi(v)}{\Phi(v)} = \dfrac{G(v)}{\Phi(v)} + 2v$

we obtain from (3.75)

(3.76)    $\dfrac{G(u)}{\Phi(u)} \ge \dfrac{\sqrt{v^2 + 4} + 3v}{2} \ge \dfrac{\sqrt{v^2 + 4} + v}{2}$.

Taking reciprocals, we obtain

$\dfrac{\Phi(u)}{G(u)} \le \dfrac{2}{\sqrt{v^2 + 4} + v} = \dfrac{\sqrt{v^2 + 4} - v}{2} = \dfrac{\sqrt{u^2 + 4} + u}{2}$.

Hence (3.71) is proved for all values of $u$ and consequently $\delta$ is equal to the value of the expression (3.67) if we substitute 0 for $u$. Thus,

(3.77)    $\delta = \dfrac{G(-\lambda)}{G(\lambda)}$.
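
The closed form for $\delta$ in the normal case, and the monotonicity claim behind it, can be checked numerically. The sketch below assumes the function studied is $\chi(u) = e^{-2\lambda u}\, G(u - \lambda)/G(u + \lambda)$ with $G$ the upper tail of the standard normal distribution, that $\delta = \chi(0) = G(-\lambda)/G(\lambda)$, and that $\eta = 1/\delta$. The function names are hypothetical; `math.erfc` is used to evaluate $G$.

```python
import math

def G(x):
    """Upper tail of the standard normal: (1/sqrt(2*pi)) * int_x^inf e^{-t^2/2} dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def chi(u, lam):
    """chi(u) = exp(-2*lam*u) * G(u - lam) / G(u + lam), for u >= 0."""
    return math.exp(-2.0 * lam * u) * G(u - lam) / G(u + lam)

def delta_normal(lam):
    """delta = chi(0) = G(-lam) / G(lam), with lam = |theta| > 0."""
    return G(-lam) / G(lam)

def eta_normal(lam):
    """eta = 1 / delta."""
    return 1.0 / delta_normal(lam)

# Evaluate chi on a grid to see that it decreases from its value at u = 0.
lam = 0.5
grid_values = [chi(0.1 * k, lam) for k in range(51)]   # u = 0.0, 0.1, ..., 5.0
```

On this grid the values fall steadily from $\chi(0) = \delta$, consistent with the maximum being attained at $u = 0$.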


4. The Number of Observations Required by the Sequential Probability Ratio Test

4.1. Expected number of observations necessary for reaching a decision. As before, let

$z_i = \log \dfrac{f_1(x_i)}{f_0(x_i)}$    $(i = 1, 2, \ldots,$ ad inf.$)$

and let $n$ be the number of observations required by the sequential test, i.e., $n$ is the smallest integer for which $Z_n = z_1 + \cdots + z_n$ is either $\ge \log A$ or $\le \log B$. To determine the expected value $E(n)$ of $n$ under any hypothesis $H$ we shall consider a fixed positive integer $N$. The sum $Z_N = z_1 + \cdots + z_N$ can be split in two parts as follows

(4.1)    $Z_N = Z_n + Z'_n$

where $Z'_n = z_{n+1} + \cdots + z_N$ if $n \le N$ and $Z'_n = Z_N - Z_n$ if $n > N$. Taking expected values on both sides of (4.1) we obtain

(4.2)    $N\,Ez = EZ_n + EZ'_n$.

Since the probability that $n > N$ converges to zero as $N \to \infty$, and since $|Z'_n| < 2(\log A + |\log B|)$ if $n > N$, it can be seen that

(4.3)    $\lim_{N \to \infty} \big[EZ'_n - E(N - n)\,Ez\big] = 0$.

From (4.2) and (4.3) it follows that

(4.4)    $EZ_n = En\,Ez$.

Hence

(4.5)    $En = \dfrac{EZ_n}{Ez}$.

Let $E^*Z_n$ be the conditional expected value of $Z_n$ under the restriction that the sequential analysis leads to the acceptance of $H_0$, i.e., that $Z_n \le \log B$. Similarly, let $E^{**}Z_n$ be the conditional expected value of $Z_n$ under the restriction that $H_1$ is accepted, i.e., that $Z_n \ge \log A$. Since $\gamma$ is the probability that $Z_n \ge \log A$, we have

(4.6)    $EZ_n = (1 - \gamma)E^*Z_n + \gamma E^{**}Z_n$.



From (4.5) and (4.6) we obtain

(4.7)    $En = \dfrac{(1 - \gamma)E^*Z_n + \gamma E^{**}Z_n}{Ez}$.

The exact value of $EZ_n$, and therefore also the exact value of $En$, can be computed if $z$ can take only integral multiples of a constant $d$, since in this case the exact probability distribution of $Z_n$ was obtained (see equation (3.47)). If $z$ does not satisfy the above restriction, it is still possible to obtain arbitrarily fine approximations to the value of $EZ_n$, since the distribution of $z$ can be approximated to any desired degree by a discrete distribution of the type mentioned above if the constant $d$ is chosen sufficiently small.

If both $|Ez|$ and the standard deviation of $z$ are small, $E^*Z_n$ is very nearly equal to $\log B$ and $E^{**}Z_n$ is very nearly equal to $\log A$. Hence in this case we can write

(4.8)    $En \sim \dfrac{(1 - \gamma) \log B + \gamma \log A}{Ez}$.

To judge the goodness of the approximation given in (4.8) we shall derive lower and upper limits for $En$ by deriving lower and upper limits for $E^*Z_n$ and $E^{**}Z_n$. Let $r$ be a non-negative variable and let

(4.9)    $\xi = \underset{r}{\mathrm{Max}}\ E(z - r \mid z \ge r)$    $(r \ge 0)$

and

(4.10)    $\xi' = \underset{r}{\mathrm{Min}}\ E(z + r \mid z + r \le 0)$    $(r \ge 0)$.

It is easy to see that

(4.11)    $\log A \le E^{**}Z_n \le \log A + \xi$

and

(4.12)    $\log B + \xi' \le E^*Z_n \le \log B$.

We obtain from (4.7), (4.11) and (4.12)

(4.13)    $\dfrac{(1 - \gamma)(\log B + \xi') + \gamma \log A}{Ez} \le En \le \dfrac{(1 - \gamma) \log B + \gamma(\log A + \xi)}{Ez}$    if $Ez > 0$

and

(4.14)    $\dfrac{(1 - \gamma) \log B + \gamma(\log A + \xi)}{Ez} \le En \le \dfrac{(1 - \gamma)(\log B + \xi') + \gamma \log A}{Ez}$    if $Ez < 0$.

4.2. Calculation of the quantities $\xi$ and $\xi'$ for binomial and normal distributions. Let $X$ be a random variable which can take only the values 0 and 1. Let the probability that $X = 1$ be $p_i$ if $H_i$ is true $(i = 0, 1)$, and $p$ if $H$ is true. Denote $1 - p$ by $q$ and $1 - p_i$ by $q_i$ $(i = 0, 1)$. Then $f_i(1) = p_i$, $f_i(0) = q_i$, $f(1) = p$ and $f(0) = q$. It can be assumed without loss of generality that $p_1 > p_0$. It is clear that $\log \dfrac{f_1(x)}{f_0(x)} > 0$ implies that $x = 1$ and consequently $\log \dfrac{f_1(x)}{f_0(x)} = \log \dfrac{f_1(1)}{f_0(1)} = \log \dfrac{p_1}{p_0}$. Hence

(4.15)    $\xi = \underset{r}{\mathrm{Max}}\ E(z - r \mid z \ge r) = \log \dfrac{p_1}{p_0}$.

Since $\log \dfrac{f_1(x)}{f_0(x)} < 0$ implies that $x = 0$, we have

(4.16)    $\xi' = \underset{r}{\mathrm{Min}}\ E(z + r \mid z + r \le 0) = \log \dfrac{q_1}{q_0}$.
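
For 0/1 observations the approximation (4.8) and the limits for $En$ are easy to evaluate. The sketch below is an illustration under stated assumptions: it uses $\xi = \log(p_1/p_0)$ and $\xi' = \log(q_1/q_0)$, takes the limits for $En$ as the two expressions obtained by inserting the extreme values of $E^*Z_n$ and $E^{**}Z_n$ from (4.11) and (4.12), and expects $\gamma$ to be supplied by the caller (for example $\gamma = \alpha$ when $H = H_0$, which is itself only approximate when $A$ and $B$ are set from $\alpha$ and $\beta$). The function name is hypothetical.

```python
import math

def expected_n_binomial(p, p0, p1, alpha, beta, gamma):
    """Return (approximation (4.8), lower limit, upper limit) for E n.

    p is the true P(X = 1); gamma is the probability of accepting H1.
    """
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    ez = p * math.log(p1 / p0) + (1.0 - p) * math.log((1.0 - p1) / (1.0 - p0))
    if ez == 0.0:
        raise ValueError("E z = 0: the ratio E Z_n / E z is undefined")
    xi = math.log(p1 / p0)                         # largest overshoot above log A
    xi_prime = math.log((1.0 - p1) / (1.0 - p0))   # largest overshoot below log B
    approx = ((1.0 - gamma) * log_b + gamma * log_a) / ez
    end_1 = ((1.0 - gamma) * (log_b + xi_prime) + gamma * log_a) / ez
    end_2 = ((1.0 - gamma) * log_b + gamma * (log_a + xi)) / ez
    # Which end is the lower limit depends on the sign of E z.
    return approx, min(end_1, end_2), max(end_1, end_2)
```

For $p = p_0 = 0.5$, $p_1 = 0.6$ and $\alpha = \beta = 0.05$ the approximation is close to 130 observations, and it lies between the two limits.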


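Now we shall calculate the values of $\xi$ and $\xi'$ if $X$ is normally distributed. Let

$f_i(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_i)^2}$    $(i = 0, 1)$ $(\theta_1 > \theta_0)$

and

$f(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta)^2}$.

We may assume without loss of generality that $\theta_0 = -\Delta$ and $\theta_1 = \Delta$ where $\Delta > 0$, since this can always be achieved by a translation. Then

(4.17)    $z = \log \dfrac{f_1(x)}{f_0(x)} = 2\Delta x$.

Denote $\dfrac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$ by $\Phi(x)$ and $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_x^{\infty} e^{-t^2/2}\, dt$ by $G(x)$. Let $t = x - \theta$. Then $z = 2\Delta(t + \theta)$ and

(4.18)    $E(z - r \mid z - r \ge 0) = 2\Delta\, E\Big(t + \theta - \dfrac{r}{2\Delta} \,\Big|\, t + \theta - \dfrac{r}{2\Delta} \ge 0\Big) = 2\Delta \Big[\dfrac{\Phi(t_0)}{G(t_0)} - t_0\Big]$

where

(4.19)    $t_0 = \dfrac{r}{2\Delta} - \theta$.

In section 3.5 (see equation (3.70)) it was proved that $\dfrac{\Phi(t_0)}{G(t_0)} - t_0$ is a monotonically decreasing function of $t_0$. Hence the maximum of $E(z - r \mid z - r \ge 0)$ is reached for $r = 0$ and consequently

(4.20)    $\xi = 2\Delta \Big[\dfrac{\Phi(-\theta)}{G(-\theta)} + \theta\Big]$.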


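Now we shall calculate $\xi'$. We have

(4.21)    $\xi' = \underset{r}{\mathrm{Min}}\ E(z + r \mid z + r \le 0) = -\underset{r}{\mathrm{Max}}\ E(-z - r \mid -z - r \ge 0) = -2\Delta\, \underset{r}{\mathrm{Max}}\ E\Big(-x - \dfrac{r}{2\Delta} \,\Big|\, -x - \dfrac{r}{2\Delta} \ge 0\Big)$.

Let $t = -x + \theta$ and $t_0 = \theta + \dfrac{r}{2\Delta}$. Then

(4.22)    $E\Big(-x - \dfrac{r}{2\Delta} \,\Big|\, -x - \dfrac{r}{2\Delta} \ge 0\Big) = E(t - t_0 \mid t \ge t_0) = \dfrac{\Phi(t_0)}{G(t_0)} - t_0$.

Since this is a monotonically decreasing function of $t_0$, we have

(4.23)    $\underset{r}{\mathrm{Max}}\ E\Big(-x - \dfrac{r}{2\Delta} \,\Big|\, -x - \dfrac{r}{2\Delta} \ge 0\Big) = \dfrac{\Phi(\theta)}{G(\theta)} - \theta$.

From (4.21) and (4.23) we obtain

(4.24)    $\xi' = -2\Delta \Big[\dfrac{\Phi(\theta)}{G(\theta)} - \theta\Big]$.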
4.3. Saving in the number of observations as compared with the current test procedure. We consider the case of a normally distributed variate, such that

$f_0(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_0)^2}$

and

$f_1(x) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(x - \theta_1)^2}$    $(\theta_1 \ne \theta_0)$.

Denote by $n(\alpha, \beta)$ the minimum number of observations necessary in the current most powerful test for the probabilities of errors of the first and second kinds to be $\alpha$ and $\beta$, respectively, or less.

We shall calculate the number of observations required by the most powerful test. It can be assumed without loss of generality that $\theta_0 < \theta_1$. According to the current most powerful test procedure the hypothesis $H_0$ is accepted if $\bar{x} < d$ and the hypothesis $H_1$ is accepted if $\bar{x} \ge d$, where $\bar{x}$ is the arithmetic mean of the observations and $d$ is a properly chosen constant. The probability of an error of the first kind is given by $G[\sqrt{n}(d - \theta_0)]$ and the probability of an error of the second kind is given by $1 - G[\sqrt{n}(d - \theta_1)]$ where $G(t) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_t^{\infty} e^{-x^2/2}\, dx$. To equate these probabilities to $\alpha$ and $\beta$, respectively, the quantities $d$ and $n$ must satisfy

(4.25)    $G[\sqrt{n}(d - \theta_0)] = \alpha$

and

(4.26)    $1 - G[\sqrt{n}(d - \theta_1)] = \beta$.

Denote by $\lambda_0$ and $\lambda_1$ the values for which $G(\lambda_0) = \alpha$ and $G(\lambda_1) = 1 - \beta$. Then we have

(4.27)    $\sqrt{n}(d - \theta_0) = \lambda_0$

and

(4.28)    $\sqrt{n}(d - \theta_1) = \lambda_1$.

Subtracting (4.27) from (4.28) we obtain

(4.29)    $\sqrt{n}(\theta_0 - \theta_1) = \lambda_1 - \lambda_0$.

From (4.29)

(4.30)    $n = n(\alpha, \beta) = \dfrac{(\lambda_1 - \lambda_0)^2}{(\theta_0 - \theta_1)^2}$.

If the expression on the right hand side of (4.30) is not an integer, $n(\alpha, \beta)$ is the smallest integer in excess.

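In the sequential probability ratio test we put $A = a(\alpha, \beta) = \dfrac{1 - \beta}{\alpha}$ and $B = b(\alpha, \beta) = \dfrac{\beta}{1 - \alpha}$. Then the probability of an error of the first (second) kind cannot exceed $\alpha$ ($\beta$) except by a negligible amount. Let $A(\alpha, \beta)$ and $B(\alpha, \beta)$ be the values of $A$ and $B$ for which the probabilities of errors of the first and second kinds become exactly equal to $\alpha$ and $\beta$, respectively. It has been shown in Section 3.2 that $A(\alpha, \beta) \le a(\alpha, \beta)$ and $B(\alpha, \beta) \ge b(\alpha, \beta)$. Thus, the expected values $E_1(n)$ and $E_0(n)$ are only increased by putting $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ instead of $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.

Consider the case where $|\theta_1 - \theta_0|$ is small so that the quantities $\xi$ and $\xi'$ can be neglected. Thus, we shall use the approximation (4.8). Since $\gamma = \alpha$ if $H = H_0$ and $\gamma = 1 - \beta$ if $H = H_1$, we obtain from (4.8)

(4.31)    $E_1(n) = \dfrac{\beta \log \dfrac{\beta}{1 - \alpha} + (1 - \beta) \log \dfrac{1 - \beta}{\alpha}}{E_1(z)}$

and

(4.32)    $E_0(n) = \dfrac{(1 - \alpha) \log \dfrac{\beta}{1 - \alpha} + \alpha \log \dfrac{1 - \beta}{\alpha}}{E_0(z)}$

where $E_i(z)$ denotes the expected value of $z$ under $H_i$ $(i = 0, 1)$. Since

(4.33)    $E_1(z) = \tfrac{1}{2}(\theta_1 - \theta_0)^2$

and

(4.34)    $E_0(z) = -\tfrac{1}{2}(\theta_1 - \theta_0)^2$

it follows from (4.30), (4.31) and (4.32) that $\dfrac{E_1(n)}{n(\alpha, \beta)}$ and $\dfrac{E_0(n)}{n(\alpha, \beta)}$ are independent of the parameters $\theta_0$ and $\theta_1$.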

TABLE 1

Average percentage saving of sequential analysis, as compared with current most powerful test for testing mean of a normally distributed variate

A. When alternative hypothesis is true:

    β \ α     .01    .02    .03    .04    .05
     .01       58     60     61     62     63
     .02       54     56     57     58     59
     .03       51     53     54     55     55
     .04       49     50     51     52     53
     .05       47     49     50     50     51

B. When null hypothesis is true:

    β \ α     .01    .02    .03    .04    .05
     .01       58     54     51     49     47
     .02       60     56     53     50     49
     .03       61     57     54     51     50
     .04       62     58     55     52     50
     .05       63     59     55     53     51


The average saving of the sequential analysis as compared with the current method is $100\Big(1 - \dfrac{E_1(n)}{n(\alpha, \beta)}\Big)$ per cent if $H_1$ is true, and $100\Big(1 - \dfrac{E_0(n)}{n(\alpha, \beta)}\Big)$ per cent if $H_0$ is true. In Table 1 the expression $100\Big(1 - \dfrac{E_1(n)}{n(\alpha, \beta)}\Big)$ is shown in Panel A, and the expression $100\Big(1 - \dfrac{E_0(n)}{n(\alpha, \beta)}\Big)$ in Panel B, for several values of $\alpha$ and $\beta$. Because of the symmetry of the normal distribution, Panel B is obtained from Panel A simply by interchanging $\alpha$ and $\beta$.


As can be seen from the table, for the range of $\alpha$ and $\beta$ from .01 to .05 (the range most frequently employed), the sequential process leads to an average saving of at least 47 per cent in the necessary number of observations as compared with the current procedure. The true saving is slightly greater than shown in the table, since $E_i(n)$ calculated under the condition that $A = a(\alpha, \beta)$ and $B = b(\alpha, \beta)$ is greater than $E_i(n)$ calculated under the condition that $A = A(\alpha, \beta)$ and $B = B(\alpha, \beta)$.
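
The percentages in Table 1 can be recomputed from the approximations of this section. The sketch below assumes $n(\alpha, \beta) = (\lambda_1 - \lambda_0)^2 / (\theta_0 - \theta_1)^2$ without rounding up to an integer, the approximation (4.8) for the expected number of observations with $\gamma = 1 - \beta$ under $H_1$ and $\gamma = \alpha$ under $H_0$, and $E_1(z) = -E_0(z) = \tfrac{1}{2}(\theta_1 - \theta_0)^2$. Because the ratio does not depend on $\theta_0$ and $\theta_1$, the code sets $\theta_1 - \theta_0 = 1$. The function name is hypothetical; `statistics.NormalDist` from the standard library supplies the normal quantiles.

```python
import math
from statistics import NormalDist

def percent_saving(alpha, beta, under_h1=True):
    """100 * (1 - E_i(n) / n(alpha, beta)) for a unit difference in means."""
    nd = NormalDist()
    lam0 = nd.inv_cdf(1.0 - alpha)   # G(lam0) = alpha, G being the upper tail
    lam1 = nd.inv_cdf(beta)          # G(lam1) = 1 - beta
    n_fixed = (lam1 - lam0) ** 2     # fixed-sample size, not rounded up
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    if under_h1:
        # gamma = 1 - beta and E_1(z) = +1/2
        e_n = ((1.0 - beta) * log_a + beta * log_b) / 0.5
    else:
        # gamma = alpha and E_0(z) = -1/2
        e_n = (alpha * log_a + (1.0 - alpha) * log_b) / (-0.5)
    return 100.0 * (1.0 - e_n / n_fixed)
```

For example, `percent_saving(0.01, 0.01)` is about 58.4 and `percent_saving(0.05, 0.05)` about 51.0, matching the corner entries of Panel A; calling the function with `under_h1=False` and $\alpha$, $\beta$ interchanged returns the same value, which is the symmetry between the two panels.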

4.4. The characteristic function, the moments and the distribution of the number of observations necessary for reaching a decision. It was shown in [4] (see equation (15) in [4]) that the following fundamental identity holds

(4.35)    $E\big\{e^{Z_n t}\,[\varphi(t)]^{-n}\big\} = 1$    $(\varphi(t) = E e^{zt})$

for all points $t$ of the complex plane for which $\varphi(t)$ exists and $|\varphi(t)| \ge 1$. The symbol $n$ denotes the number of observations required by the sequential test, i.e., $n$ is the smallest positive integer for which $Z_n$ is either $\ge \log A$ or $\le \log B$, and $\varphi(t)$ denotes the moment generating function of $z$.

On the basis of the identity (4.35) the exact characteristic function of $n$ is derived in section 7 of [4] in the case when $z$ can take only integral multiples of a constant. If the number of different values which $Z_n$ can take is large, the calculation of the exact characteristic function is cumbersome, because a large number of simultaneous linear equations have to be solved. However, if $|Ez|$ and $\sigma_z$ are small so that $|Z_n - \log A|$ (when $Z_n \ge \log A$) and $|Z_n - \log B|$ (when $Z_n \le \log B$) can be neglected, the calculation of the characteristic function is much simpler, as was shown in [4]. We shall briefly state the results obtained in [4]. Let $h$ be the real value $\ne 0$ for which $\varphi(h) = 1$. Furthermore let $t = t_1(\tau)$ and $t = t_2(\tau)$ be the roots of the equation in $t$

$-\log \varphi(t) = \tau$

such that $\lim_{\tau = 0} t_1(\tau) = 0$ and $\lim_{\tau = 0} t_2(\tau) = h$. Finally, let $\psi_1(\tau)$ be the characteristic function of the conditional distribution of $n$ under the restriction that $Z_n \ge \log A$, and $\psi_2(\tau)$ the characteristic function of the conditional distribution of $n$ under the restriction that $Z_n \le \log B$. Then, if $|Z_n - \log A|$ (when $Z_n \ge \log A$) and $|Z_n - \log B|$ (when $Z_n \le \log B$) can be neglected, $\psi_1(\tau)$ and $\psi_2(\tau)$ are the solutions of the linear equations

(4.36)    $\gamma \psi_1(\tau) A^{t_1(\tau)} + (1 - \gamma) \psi_2(\tau) B^{t_1(\tau)} = 1$

and

(4.37)    $\gamma \psi_1(\tau) A^{t_2(\tau)} + (1 - \gamma) \psi_2(\tau) B^{t_2(\tau)} = 1$

where

$\gamma = P(Z_n \ge \log A) = \dfrac{1 - B^h}{A^h - B^h}$.

The characteristic function of the unconditional distribution of $n$ is

(4.38)    $\psi(\tau) = \gamma \psi_1(\tau) + (1 - \gamma) \psi_2(\tau)$.


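As an illustration we shall determine $\psi_1(\tau)$, $\psi_2(\tau)$ and $\psi(\tau)$ when $z$ has a normal distribution. Then we have

$-\log \varphi(t) = -(Ez)\,t - \dfrac{\sigma_z^2}{2}\, t^2 = \tau$.

Hence

(4.39)    $t_1(\tau) = \dfrac{1}{\sigma_z^2}\Big(-Ez + \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\Big)$,

(4.40)    $t_2(\tau) = \dfrac{1}{\sigma_z^2}\Big(-Ez - \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\Big)$.

From (4.36), (4.37) and (4.38) we obtain

(4.41)    $\gamma \psi_1(\tau) = \dfrac{B^{g_2} - B^{g_1}}{A^{g_1} B^{g_2} - A^{g_2} B^{g_1}}$,

(4.42)    $(1 - \gamma) \psi_2(\tau) = \dfrac{A^{g_1} - A^{g_2}}{A^{g_1} B^{g_2} - A^{g_2} B^{g_1}}$

and

(4.43)    $\psi(\tau) = \dfrac{A^{g_1} + B^{g_2} - A^{g_2} - B^{g_1}}{A^{g_1} B^{g_2} - A^{g_2} B^{g_1}}$

where

(4.44)    $g_1 = \dfrac{1}{\sigma_z^2}\Big(-Ez + \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\Big)$

and

(4.45)    $g_2 = \dfrac{1}{\sigma_z^2}\Big(-Ez - \sqrt{(Ez)^2 - 2\sigma_z^2 \tau}\Big)$.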

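For any positive integer $r$ the $r$-th moment of $n$, i.e., $E(n^r)$, is equal to the $r$-th derivative of $\psi(\tau)$ taken at $\tau = 0$. Let $E^*(n^r)$ be the conditional expected value of $n^r$ under the restriction that $Z_n \le \log B$, and let $E^{**}(n^r)$ be the conditional expected value of $n^r$ under the restriction that $Z_n \ge \log A$. Then

(4.46)    $E^*(n^r) = \dfrac{d^r \psi_2(\tau)}{d\tau^r}\Big|_{\tau = 0}$    and    $E^{**}(n^r) = \dfrac{d^r \psi_1(\tau)}{d\tau^r}\Big|_{\tau = 0}$.

It may be of interest to note that $\dfrac{d^r \psi_k(\tau)}{d\tau^r}\Big|_{\tau = 0}$ $(k = 1, 2)$ and therefore also the moments of $n$ can be obtained from the identity (4.35) directly by successive differentiation. In fact, the identity (4.35) can be written as (neglecting the excess of $Z_n$ over the boundaries $\log A$ and $\log B$)

(4.47)    $\gamma A^t \psi_1[-\log \varphi(t)] + (1 - \gamma) B^t \psi_2[-\log \varphi(t)] = 1$.

Taking the first $r$ derivatives of (4.47) with respect to $t$ at $t = 0$ and $t = h$ we obtain a system of $2r$ linear equations in the $2r$ unknowns $\dfrac{d^j \psi_k(\tau)}{d\tau^j}\Big|_{\tau = 0}$ $(k = 1, 2;\ j = 1, \ldots, r)$ from which these unknowns can be determined. For example, $\dfrac{d \psi_k(\tau)}{d\tau}\Big|_{\tau = 0}$ $(k = 1, 2)$ can be determined as follows: Taking the first derivative of (4.47) with respect to $t$ and denoting $\dfrac{d \psi_k(\tau)}{d\tau}$ by $\psi_k^{(1)}(\tau)$ we obtain

(4.48)    $\gamma (\log A) A^t \psi_1[-\log \varphi(t)] - \gamma A^t \dfrac{\varphi'(t)}{\varphi(t)} \psi_1^{(1)}[-\log \varphi(t)] + (1 - \gamma)(\log B) B^t \psi_2[-\log \varphi(t)] - (1 - \gamma) B^t \dfrac{\varphi'(t)}{\varphi(t)} \psi_2^{(1)}[-\log \varphi(t)] = 0$.

Putting $t = 0$ and $t = h$ we obtain the equations

(4.49)    $\gamma \log A - \gamma \dfrac{\varphi'(0)}{\varphi(0)} \psi_1^{(1)}(0) + (1 - \gamma) \log B - (1 - \gamma) \dfrac{\varphi'(0)}{\varphi(0)} \psi_2^{(1)}(0) = 0$

and

(4.50)    $\gamma (\log A) A^h - \gamma A^h \dfrac{\varphi'(h)}{\varphi(h)} \psi_1^{(1)}(0) + (1 - \gamma)(\log B) B^h - (1 - \gamma) B^h \dfrac{\varphi'(h)}{\varphi(h)} \psi_2^{(1)}(0) = 0$

from which $\psi_1^{(1)}(0)$ and $\psi_2^{(1)}(0)$ can be determined.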

The distribution of n can be obtained by inverting the characteristic function $\psi(\tau)$. This was done in [4] (neglecting the excess of $Z_n$ over log A and log B) in the case when z is normally distributed. The results obtained in [4] can be briefly stated as follows: If B = 0, or if B > 0 and A = ∞, the distribution of n is a simple elementary function. If B = 0 and Ez > 0, the distribution of $m = \frac{1}{2\sigma_z^2}(Ez)^2 n$ is given by

(4.51)  $F(m)\,dm = \frac{c}{2\Gamma(\frac{1}{2})\,m^{3/2}}\,e^{-\frac{c^2}{4m} - m + c}\,dm \qquad (0 \le m < \infty)$

where

(4.52)  $c = \frac{1}{\sigma_z^2}(Ez)\log A.$

If B > 0, A = ∞ and Ez < 0, the distribution of $m = \frac{1}{2\sigma_z^2}(Ez)^2 n$ is given by the expression we obtain from (4.51) if we substitute $\frac{1}{\sigma_z^2}(Ez)\log B$ for c.

If B > 0 and A < ∞, the distribution of m is given by an infinite series where each term is of the form (4.51) (see equation (76) in [4]).



Since m is a discrete variable, it may seem paradoxical that we obtained a probability density function for m. However, the explanation lies in the fact that we neglected the excess of $Z_n$ over log A and log B, which is zero only in the limiting case when Ez and $\sigma_z$ approach zero.

The distribution of m given in (4.51) can be used as a good approximation to the exact distribution of m even if B > 0, provided that the probability that $Z_n \ge \log A$ is nearly equal to 1.

It was pointed out in [4] that if |Ez| and $\sigma_z$ are sufficiently small, the distribution of n determined under the assumption that z is normally distributed will be a good approximation to the exact distribution of n even if z is not normally distributed.

4.5. Lower limit of the probability that the sequential process will terminate with a number of trials less than or equal to a given number. Let $P_i(n_0)$ be the probability that the sequential process will terminate at a value $n \le n_0$, calculated under $H_i$ (i = 0, 1). Let

(4.53)  $\bar P_0(n_0) = P_0\left[\sum_{\alpha=1}^{n_0} z_\alpha \le \log B\right]$

and

(4.54)  $\bar P_1(n_0) = P_1\left[\sum_{\alpha=1}^{n_0} z_\alpha \ge \log A\right].$

It is clear that

(4.55)  $\bar P_i(n_0) \le P_i(n_0) \qquad (i = 0, 1).$

For calculating $\bar P_i(n_0)$ we shall assume that $n_0$ is sufficiently large so that $\sum_{\alpha=1}^{n_0} z_\alpha$ can be regarded as normally distributed. Let $G(\lambda)$ be defined by

(4.56)  $G(\lambda) = \frac{1}{\sqrt{2\pi}}\int_\lambda^\infty e^{-t^2/2}\,dt.$

Furthermore, let

(4.57)  $\lambda_1(n_0) = \frac{\log A - n_0E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}$

and

(4.58)  $\lambda_0(n_0) = \frac{\log B - n_0E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}$

where $\sigma_i(z)$ is the standard deviation of z under $H_i$. Then

(4.59)  $\bar P_1(n_0) = G[\lambda_1(n_0)]$

and

(4.60)  $\bar P_0(n_0) = 1 - G[\lambda_0(n_0)].$

Hence we have the inequalities

(4.61)  $P_1(n_0) \ge G[\lambda_1(n_0)]$

and

(4.62)  $P_0(n_0) \ge 1 - G[\lambda_0(n_0)].$

Putting $\log A = \log\frac{1-\beta}{\alpha}$ and $\log B = \log\frac{\beta}{1-\alpha}$, Table 2 shows the values of $\bar P_1(n_0)$ and $\bar P_0(n_0)$ corresponding to different pairs $(\alpha, \beta)$ and different values of $n_0$. In these calculations it has been assumed that the distribution under $H_0$ is a normal distribution with mean zero and unit variance, and the distribution under $H_1$ is a normal distribution with mean $\theta$ and unit variance. For each pair $(\alpha, \beta)$ the value of $\theta$ was determined so that the number of observations required by the current most powerful test of strength $(\alpha, \beta)$ is equal to 1000.

TABLE 2

Lower bound of the probability* that a sequential analysis will terminate within various numbers of trials, when the most powerful current test requires exactly 1000 trials

              α = .01 and β = .01       α = .01 and β = .05       α = .05 and β = .05
Number of   Alternative     Null      Alternative     Null      Alternative     Null
trials      hypothesis   hypothesis   hypothesis   hypothesis   hypothesis   hypothesis
               true         true         true         true         true         true

  1000         .910         .910         .799         .891         .773         .773
  1200         .950         .950         .871         .932         .837         .837
  1400         .972         .972         .916         .957         .883         .883
  1600         .985         .985         .946         .972         .915         .915
  1800         .991         .991         .965         .982         .938         .938
  2000         .995         .995         .977         .989         .955         .955
  2200         .997         .997         .985         .993         .967         .967
  2400         .999         .999         .990         .995         .976         .976
  2600         .999         .999         .994         .997         .982         .982
  2800        1.00         1.00          .996         .998         .987         .987
  3000        1.00         1.00          .997         .999         .990         .990

* The probabilities given are lower bounds for the true probabilities. They relate to a test of the mean of a normally distributed variate, the difference between the null and alternative hypotheses being adjusted for each pair of values of α and β so that the number of trials required under the most powerful current test is exactly 1000.
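The entries of Table 2 can be recomputed from (4.57) to (4.60). The following is a minimal sketch, not the author's computation: it assumes that the fixed-sample most powerful test needs $n = (z_\alpha + z_\beta)^2/\theta^2$ observations, so that $\theta = (z_\alpha + z_\beta)/\sqrt{1000}$, and that $z = \theta x - \theta^2/2$ is the logarithm of the likelihood ratio of a single observation. The function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def G(lam):
    """Upper tail of the standard normal distribution, as in (4.56)."""
    return 1.0 - STD_NORMAL.cdf(lam)


def theta_for_fixed_sample(alpha, beta, n_fixed=1000):
    """Mean under H1 for which the fixed-sample test of strength (alpha, beta) needs n_fixed trials."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1.0 - beta)
    return (z_alpha + z_beta) / sqrt(n_fixed)


def termination_lower_bounds(alpha, beta, n0, n_fixed=1000):
    """Return (bound when H1 is true, bound when H0 is true), per (4.61) and (4.62)."""
    theta = theta_for_fixed_sample(alpha, beta, n_fixed)
    log_a = log((1.0 - beta) / alpha)
    log_b = log(beta / (1.0 - alpha))
    mean1 = theta ** 2 / 2.0   # E_1(z) for z = theta * x - theta^2 / 2
    mean0 = -theta ** 2 / 2.0  # E_0(z)
    scale = sqrt(n0) * theta   # sqrt(n0) * sigma(z); sigma(z) = theta under both hypotheses
    lam1 = (log_a - n0 * mean1) / scale
    lam0 = (log_b - n0 * mean0) / scale
    return G(lam1), 1.0 - G(lam0)


if __name__ == "__main__":
    for n0 in (1000, 2000, 3000):
        print(n0, termination_lower_bounds(0.01, 0.05, n0))
```

Under these assumptions the values for $n_0 = 1000$ and $(\alpha, \beta) = (.01, .05)$ come out close to the tabulated .799 and .891.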

4.6. Truncated sequential analysis. In some applications a definite upper bound for the number of observations may be desirable. Thus, a certain integer $n_0$ is chosen so that if the sequential process does not lead to a final decision for $n \le n_0$, a new rule is given for the acceptance or rejection of $H_0$ at the stage $n = n_0$.

A simple and reasonable rule for the acceptance or rejection of $H_0$ at the stage $n = n_0$ can be given as follows: If $\sum_{\alpha=1}^{n_0} z_\alpha \le 0$ we accept $H_0$ and if $\sum_{\alpha=1}^{n_0} z_\alpha > 0$ we accept $H_1$. By thus truncating the sequential process we change, however, the probabilities of errors of the first and second kinds. Let $\alpha$ and $\beta$ be the probabilities of errors of the first and second kinds, respectively, if the sequential test is not truncated. Let $\alpha(n_0)$ and $\beta(n_0)$ be the probabilities of errors of the first and second kinds if the test is truncated at $n = n_0$. We shall derive upper bounds for $\alpha(n_0)$ and $\beta(n_0)$.

First we shall derive an upper bound for $\alpha(n_0)$. Let $\rho_0(n_0)$ be the probability (under the null hypothesis) that the following three conditions are simultaneously fulfilled:

(i) $\log B < \sum_{\alpha=1}^{n} z_\alpha < \log A$ for $n = 1, \ldots, n_0 - 1$

(ii) $0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$

(iii) continuing the sequential process beyond $n_0$, it terminates with the acceptance of $H_0$.

It is clear that

(4.63)  $\alpha(n_0) \le \alpha + \rho_0(n_0).$

Let $\bar\rho_0(n_0)$ be the probability (under the null hypothesis) that $0 < \sum_{\alpha=1}^{n_0} z_\alpha < \log A$. Then obviously

$\rho_0(n_0) \le \bar\rho_0(n_0)$

and consequently

(4.64)  $\alpha(n_0) \le \alpha + \bar\rho_0(n_0).$

Let $\rho_1(n_0)$ be the probability under the alternative hypothesis that the following three conditions are simultaneously fulfilled:

(i) $\log B < \sum_{\alpha=1}^{n} z_\alpha < \log A$ for $n = 1, \ldots, n_0 - 1$

(ii) $\log B < \sum_{\alpha=1}^{n_0} z_\alpha \le 0$

(iii) continuing the sequential process beyond $n_0$, it terminates with the acceptance of $H_1$.

It is clear that

(4.65)  $\beta(n_0) \le \beta + \rho_1(n_0).$

Let $\bar\rho_1(n_0)$ be the probability (under the alternative hypothesis) that $\log B < \sum_{\alpha=1}^{n_0} z_\alpha \le 0$. Then $\rho_1(n_0) \le \bar\rho_1(n_0)$ and consequently

(4.66)  $\beta(n_0) \le \beta + \bar\rho_1(n_0).$



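Let

$\nu_1 = \frac{-n_0E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}, \quad \nu_2 = \frac{\log A - n_0E_0(z)}{\sqrt{n_0}\,\sigma_0(z)}, \quad \nu_3 = \frac{-n_0E_1(z)}{\sqrt{n_0}\,\sigma_1(z)}, \quad \nu_4 = \frac{\log B - n_0E_1(z)}{\sqrt{n_0}\,\sigma_1(z)},$

where $\sigma_i(z)$ is the standard deviation of z under $H_i$ (i = 0, 1). Then

(4.67)  $\bar\rho_0(n_0) = G(\nu_1) - G(\nu_2)$

and

(4.68)  $\bar\rho_1(n_0) = G(\nu_4) - G(\nu_3).$

From (4.64), (4.66), (4.67) and (4.68) we obtain

(4.69)  $\alpha(n_0) \le \alpha + G(\nu_1) - G(\nu_2)$

and

(4.70)  $\beta(n_0) \le \beta + G(\nu_4) - G(\nu_3).$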

The upper bounds given in (4.69) and (4.70) may considerably exceed $\alpha(n_0)$ and $\beta(n_0)$, respectively. It would be desirable to find closer limits.

Table 3 shows the values of the upper bounds of $\alpha(n_0)$ and $\beta(n_0)$ given by formulas (4.69) and (4.70) corresponding to different pairs $(\alpha, \beta)$ and different values of $n_0$. In these calculations we have put $\log A = \log\frac{1-\beta}{\alpha}$, $\log B = \log\frac{\beta}{1-\alpha}$ and assumed that the distribution under $H_0$ is a normal distribution with mean zero and unit variance, and the distribution under $H_1$ is a normal distribution with mean $\theta$ and unit variance. For each pair $(\alpha, \beta)$ the value of $\theta$ has been determined so that the number of observations required by the current most powerful test of strength $(\alpha, \beta)$ is equal to 1000.

It seems to the author that the upper limits given in (4.69) and (4.70) are considerably above the true $\alpha(n_0)$ and $\beta(n_0)$, respectively, when $n_0$ is not much higher than the value of n needed for the current most powerful test.
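The bounds of the form $\alpha + \bar\rho_0(n_0)$ and $\beta + \bar\rho_1(n_0)$ can be evaluated with a few lines of code. This is a sketch under stated assumptions rather than the original computation: the normal setup described for Table 3 is used, $\theta$ is taken as $(z_\alpha + z_\beta)/\sqrt{1000}$, and $\bar\rho_0$, $\bar\rho_1$ are computed directly as normal probabilities of the intervals $(0, \log A)$ and $(\log B, 0]$ for the cumulative sum. All names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def upper_tail(x):
    """G(x): probability that a standard normal variate exceeds x."""
    return 1.0 - STD_NORMAL.cdf(x)


def truncation_error_bounds(alpha, beta, n0, n_fixed=1000):
    """Upper bounds for the error probabilities of a test truncated at n0 trials."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1.0 - beta)
    theta = (z_alpha + z_beta) / sqrt(n_fixed)  # assumed link to the fixed-sample size
    log_a = log((1.0 - beta) / alpha)
    log_b = log(beta / (1.0 - alpha))
    mean0, mean1 = -theta ** 2 / 2.0, theta ** 2 / 2.0
    scale = sqrt(n0) * theta  # standard deviation of the sum of n0 values of z

    # P_0(0 < sum z < log A): difference of two upper tails under H0
    rho0_bar = upper_tail((0.0 - n0 * mean0) / scale) - upper_tail((log_a - n0 * mean0) / scale)
    # P_1(log B < sum z <= 0): difference of two upper tails under H1
    rho1_bar = upper_tail((log_b - n0 * mean1) / scale) - upper_tail((0.0 - n0 * mean1) / scale)
    return alpha + rho0_bar, beta + rho1_bar


if __name__ == "__main__":
    for n0 in (1000, 2000, 3000):
        print(n0, truncation_error_bounds(0.01, 0.01, n0))
```

For $\alpha = \beta = .01$ the bounds start near .020 at $n_0 = 1000$ and fall toward .010 as $n_0$ grows, which is the pattern the table reports.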

4.7. Efficiency of the sequential probability ratio test. Let S be any sequential test for which the probability of an error of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$ and the probability that the test procedure will eventually terminate is one. Let S' be the sequential probability ratio test whose strength is equal to that of S. We shall prove that the sequential probability ratio test is an optimum test, i.e., that $E_i(n \mid S) \ge E_i(n \mid S')$ (i = 0, 1), if for S' the excess of $Z_n$ over log A and log B can be neglected. This excess is exactly zero if z can take only the values d and −d and if log A and log B are integral multiples of d. In any other case the excess will not be identically zero. However, if |Ez| and $\sigma_z$ are sufficiently small, the excess of $Z_n$ over log A and log B is negligible.

For any random variable u we shall denote by $E_i^*(u \mid S)$ the conditional expected value of u under the hypothesis $H_i$ (i = 0, 1) and under the restriction that $H_0$ is accepted. Similarly, let $E_i^{**}(u \mid S)$ be the conditional expected value of u under the hypothesis $H_i$ (i = 0, 1) and under the restriction that $H_1$ is accepted. In the notations for these expected values the symbol S stands for the sequential test used.

TABLE 3

Effect on risks of error of truncating* a sequential analysis at a predetermined number of trials

              α = .01 and β = .01         α = .01 and β = .05         α = .05 and β = .05
Number of   Upper bound  Upper bound   Upper bound  Upper bound   Upper bound  Upper bound
trials      of effective of effective  of effective of effective  of effective of effective
                 α            β             α            β             α            β

  1000         .020         .020       [illegible]  [illegible]   [illegible]     .095
  1200         .015         .015       [illegible]  [illegible]      .082         .082
  1400         .013         .013       [illegible]     .058          .072         .072
  1600         .012         .012          .016         .055          .066         .066
  1800         .011         .011          .014         .053          .062         .062
  2000         .010         .010          .012         .052          .058         .058
  2200         .010         .010          .012         .051          .056         .056
  2400         .010         .010          .011         .051          .055         .055
  2600         .010         .010          .011         .051          .053         .053
  2800         .010         .010          .010         .050          .053         .053
  3000         .010         .010          .010         .050          .052         .052

* If the sequential analysis is based on the values α and β shown, but a decision is made at $n_0$ trials even when the normal sequential criteria would require a continuation of the process, the realized values of α and β will not exceed the tabular entries. The table relates to a test of the mean of a normally distributed variate, the difference between the null and alternative hypotheses being adjusted for each pair (α, β) so that the number of trials required by the current test is 1000.

Denote by $Q_i(S)$ the totality of all samples for which the test S leads to the acceptance of $H_i$. Then we have

(4.71)  $E_0^*\left(\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{P_1[Q_0(S)]}{P_0[Q_0(S)]} = \frac{\beta}{1 - \alpha},$

(4.72)  $E_0^{**}\left(\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right) = \frac{P_1[Q_1(S)]}{P_0[Q_1(S)]} = \frac{1 - \beta}{\alpha},$

(4.73)  $E_1^*\left(\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = \frac{P_0[Q_0(S)]}{P_1[Q_0(S)]} = \frac{1 - \alpha}{\beta}$

and

(4.74)  $E_1^{**}\left(\frac{p_{0n}}{p_{1n}} \,\Big|\, S\right) = \frac{P_0[Q_1(S)]}{P_1[Q_1(S)]} = \frac{\alpha}{1 - \beta}.$

















To prove the efficiency of the sequential probability ratio test, we shall first derive two lemmas.

Lemma 1. For any random variable u the inequality

(4.75)  $e^{Eu} \le Ee^{u}$

holds.

Proof: Inequality (4.75) can be written as

(4.76)  $1 \le Ee^{u'}$

where $u' = u - Eu$. Lemma 1 is proved if we show that (4.76) holds for any random variable u' with zero mean. Expanding $e^{u'}$ in a Taylor series around u' = 0, we obtain

(4.77)  $e^{u'} = 1 + u' + \tfrac{1}{2}u'^2e^{\xi(u')}$, where $\xi(u')$ lies between 0 and u'.

Hence, since Eu' = 0,

(4.78)  $Ee^{u'} = 1 + \tfrac{1}{2}E\,u'^2e^{\xi(u')} \ge 1$

and Lemma 1 is proved.

Lemma 2. Let S be a sequential test such that there exists a finite integer N with the property that the number n of observations required for the test is $\le N$. Then

(4.79)  $E_i(n \mid S) = \frac{E_i\left(\log\frac{p_{1n}}{p_{0n}} \,\Big|\, S\right)}{E_i(z)} \qquad (i = 0, 1).$

The proof is omitted, since it is essentially the same as that of equation (4.5) for the sequential probability ratio test.

On the basis of Lemmas 1 and 2 we shall be able to derive the following theorem.

Theorem. Let S be any sequential test for which the probability of an error of the first kind is $\alpha$, the probability of an error of the second kind is $\beta$ and the probability that the test procedure will eventually terminate is equal to one. Then

(4.80)  $E_0(n \mid S) \ge \frac{1}{E_0(z)}\left[(1 - \alpha)\log\frac{\beta}{1 - \alpha} + \alpha\log\frac{1 - \beta}{\alpha}\right]$

and

(4.81)  $E_1(n \mid S) \ge \frac{1}{E_1(z)}\left[\beta\log\frac{\beta}{1 - \alpha} + (1 - \beta)\log\frac{1 - \beta}{\alpha}\right].$
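To give a feeling for the size of such lower bounds, the sketch below evaluates them for the normal example used in Tables 2 and 3. It assumes the bounds take the standard form $[(1-\alpha)\log\frac{\beta}{1-\alpha} + \alpha\log\frac{1-\beta}{\alpha}]/E_0(z)$ and $[\beta\log\frac{\beta}{1-\alpha} + (1-\beta)\log\frac{1-\beta}{\alpha}]/E_1(z)$, with $E_0(z) = -\theta^2/2$, $E_1(z) = \theta^2/2$ and $\theta = (z_\alpha + z_\beta)/\sqrt{1000}$; these specifics and the function name are illustrative assumptions.

```python
from math import log, sqrt
from statistics import NormalDist


def expected_n_lower_bounds(alpha, beta, n_fixed=1000):
    """Lower bounds for E_0(n) and E_1(n) in the unit-variance normal example."""
    std = NormalDist()
    theta = (std.inv_cdf(1.0 - alpha) + std.inv_cdf(1.0 - beta)) / sqrt(n_fixed)
    mean0 = -theta ** 2 / 2.0  # E_0(z) for z = theta * x - theta^2 / 2
    mean1 = theta ** 2 / 2.0   # E_1(z)
    log_a = log((1.0 - beta) / alpha)
    log_b = log(beta / (1.0 - alpha))
    bound0 = ((1.0 - alpha) * log_b + alpha * log_a) / mean0
    bound1 = (beta * log_b + (1.0 - beta) * log_a) / mean1
    return bound0, bound1


if __name__ == "__main__":
    print(expected_n_lower_bounds(0.01, 0.01))
```

With $\alpha = \beta = .01$ both bounds are a little above 416 observations, well below the 1000 observations of the fixed-sample test to which the example is calibrated.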

Proof: First we shall prove the theorem in the case when there exists a finite integer N such that n never exceeds N. According to Lemma 2 we have


(4.82)





Part II. Sequential Test of a Simple or Composite Hypothesis Against a Set of Alternatives

In Part I we have dealt with the problem of testing a simple hypothesis $H_0$ against a single alternative $H_1$. Here we shall consider the problem of testing a simple or composite hypothesis against a set of infinitely many alternatives. By a simple hypothesis we mean a hypothesis which specifies uniquely the probability distribution of the random variable x under consideration. A hypothesis is called composite if it is not simple.

5. Test of a Simple Hypothesis Against One-sided Alternatives

5.1. General remarks. Let $f(x, \theta)$ be the probability density function of a random variable X, where $\theta$ is an unknown parameter. Suppose that it is required to test the simple hypothesis that $\theta = \theta_0$ and that the alternative values of $\theta$ are restricted to values $\theta > \theta_0$. Assume that it is desired to have a sequential test such that the probability of an error of the first kind is equal to a given $\alpha$.

The probability of an error of the second kind is no longer a single value, but is a function of the true value of $\theta$. If $f(x, \theta)$ is a continuous function of x and $\theta$, the probability of an error of the second kind will be arbitrarily near $1 - \alpha$ if the true value of $\theta$ is sufficiently near $\theta_0$. Hence, if $\alpha$ is small, the probability of an error of the second kind is necessarily large when the true value of $\theta$ is very near $\theta_0$. In most practical applications we do not care if the probability of an error of the second kind is high when the true value of $\theta$ is very near $\theta_0$, since in this case the error committed by accepting $\theta_0$ is usually of very little importance. However, there will be a value $\theta_1 > \theta_0$ such that we wish the probability of an error of the second kind to be less than or equal to a given small positive value $\beta$ whenever the true value of $\theta$ is greater than or equal to $\theta_1$.

In this case we can proceed as follows: Consider the single alternative hypothesis $H_1$ that $\theta = \theta_1$. Construct a sequential test for testing $\theta = \theta_0$ against the single alternative $\theta_1$ such that the probability of an error of the first kind is $\alpha$ and the probability of an error of the second kind, i.e., the probability of accepting $\theta_0$ when $\theta_1$ is true, is $\beta$. If this sequential test has the further property that the probability of an error of the second kind is less than or equal to $\beta$ whenever the true value of $\theta$ is greater than $\theta_1$, then this sequential test provides a satisfactory solution of the problem of testing the hypothesis that $\theta = \theta_0$ against the set of alternatives $\theta > \theta_0$.

In most of the important cases occurring in practice, such as when X has a normal, binomial, or Poisson distribution, etc., the sequential probability ratio test for testing the hypothesis that $\theta = \theta_0$ against a single alternative $\theta_1$ ($\theta_1 > \theta_0$) satisfies the condition that the probability of an error of the second kind is a monotonically decreasing function of $\theta$ in the domain $\theta > \theta_0$. Thus, in all these cases the sequential probability ratio test for testing the hypothesis that $\theta = \theta_0$ against a properly chosen alternative $\theta_1$ provides a satisfactory solution of our problem.
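The monotonicity claim can be illustrated for the mean of a normal variate with unit variance. The sketch below is not taken from the text: it assumes the usual approximation $(A^h - 1)/(A^h - B^h)$ for the probability of accepting $\theta_0$ (boundary excess neglected), with $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$, and uses $h = (\theta_0 + \theta_1 - 2\theta)/(\theta_1 - \theta_0)$, which follows from solving $E_\theta e^{hz} = 1$ for $z = (\theta_1 - \theta_0)x - (\theta_1^2 - \theta_0^2)/2$. The function name is hypothetical.

```python
from math import exp, log


def accept_probability(theta, theta0, theta1, alpha, beta):
    """Approximate probability of accepting theta0 when the true mean is theta."""
    log_a = log((1.0 - beta) / alpha)
    log_b = log(beta / (1.0 - alpha))
    h = (theta0 + theta1 - 2.0 * theta) / (theta1 - theta0)
    if abs(h) < 1e-12:
        # limiting value of (A^h - 1) / (A^h - B^h) as h tends to zero
        return log_a / (log_a - log_b)
    a_h = exp(h * log_a)
    b_h = exp(h * log_b)
    return (a_h - 1.0) / (a_h - b_h)


if __name__ == "__main__":
    for step in range(0, 11):
        theta = step / 10.0
        print(theta, round(accept_probability(theta, 0.0, 0.5, 0.05, 0.10), 4))
```

For $\theta > \theta_0$ the printed probability is the probability of an error of the second kind; it equals $\beta$ at $\theta_1$ and keeps decreasing beyond it, which is the property the argument relies on.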



The case in which the alternative values of $\theta$ are restricted to values less than $\theta_0$ is entirely analogous to that in which the alternatives are restricted to values greater than $\theta_0$, and need not be discussed separately.

It should be pointed out that the test procedure for testing $\theta = \theta_0$ against alternatives $\theta > \theta_0$, as described in this section, is also suitable for testing the composite hypothesis that $\theta \le \theta_0$, provided that the probability of rejecting the null hypothesis is $\le \alpha$ whenever the true value of $\theta$ is $\le \theta_0$. This condition is fulfilled, for instance, when X has a normal, binomial or Poisson distribution.

5.2. Application to binomial distributions. 5.2.1. Statement of the problem.
The case of a binomial distribution arises when the result of a single observa¬ 
tion is a classification into one of two categories. For example, this is the 
situation in acceptance inspection of manufactured products, if each unit 
inspected is classified into one of the two categories, non-defective and defective. 
Let p denote the probability that an item belongs to a given category. The 
value of p is usually unknown. We shall deal here with the problem of testing 
the hypothesis that p does not exceed a given value p' against the alternative 
possibility that p > p'. 

Since acceptance inspection of manufactured products is perhaps the most important and widest field of application of such a test procedure, we shall, in continuing the discussion, use the terminology of acceptance inspection. This, of course, does not mean that the test procedure is not applicable to other cases. Suppose that a lot containing a large number of units is submitted for sampling inspection. Let p denote the proportion of defective units contained in the lot. The probability that a unit drawn at random from the lot will be defective is equal to p. If m units are drawn at random from the lot, the probability that there will be d defectives among them is given by*

(5.1)  $\frac{m!}{d!\,(m - d)!}\,p^d(1 - p)^{m-d}.$

The probability distribution as given in (5.1) is called a binomial distribution.

* Formula (5.1) is exact only if the lot contains infinitely many units. While the lot is always finite in practice, we shall assume that m is small as compared with the lot size so that formula (5.1) can be used.

The purpose of sampling inspection is to decide whether the lot should be accepted or rejected. It is clear that for high values of p we want to reject the lot and for low values of p we want to accept the lot. Thus, it will be possible to specify a particular value of p, say p', so that if $p \le p'$ we wish to accept the lot, and if $p > p'$ we wish to reject the lot. Thus, our problem is to devise a proper sampling inspection plan for testing the hypothesis that $p \le p'$.

5.2.2. Tolerated risks for making a wrong decision. No sampling inspection plan can guarantee that the correct decision will always be made, i.e., that the lot will always be accepted when $p \le p'$ and the lot will always be rejected when $p > p'$, unless the lot is inspected completely. A complete inspection is usually rather uneconomical and one is willing to take some risk of making a wrong decision if this permits a reduction in the amount of inspection. Hence, recommendations as to the proper choice of a sampling inspection plan can be made only after the risks that can be tolerated have been stated.

If p is equal to the marginal value p', we may say that it is indifferent to us whether the lot is accepted or rejected. If $p < p'$ we prefer acceptance, and this preference is the stronger the smaller p is. Similarly, if $p > p'$ we prefer rejection of the lot, and this preference increases as p increases. Thus, it will be possible to select a value $p_0 < p'$ and a value $p_1 > p'$ such that the error is considered serious only if we accept the lot when $p \ge p_1$, or we reject the lot when $p \le p_0$.

After the two values $p_0$ and $p_1$ have been selected, the risks that we are willing to tolerate may reasonably be stated as follows: a sampling inspection plan is required such that the probability of rejecting the lot is less than or equal to a preassigned value $\alpha$ whenever $p \le p_0$, and the probability of accepting the lot is less than or equal to a preassigned value $\beta$ whenever $p \ge p_1$. Thus, the tolerated risks are characterized by the four quantities $p_0$, $p_1$, $\alpha$ and $\beta$. The proper sampling plan can be determined after these four quantities have been chosen.

5.2.3. The sequential probability ratio test corresponding to the quantities $p_0$, $p_1$, $\alpha$ and $\beta$. Let $H_0$ be the hypothesis that $p = p_0$ and $H_1$ the hypothesis that $p = p_1$. Consider the sequential probability ratio test T for testing $H_0$ against $H_1$ for which $\alpha$ is the probability of accepting $H_1$ when $H_0$ is true (error of the first kind) and $\beta$ is the probability of accepting $H_0$ when $H_1$ is true (error of the second kind). This probability ratio test will satisfy all our requirements, since for this test the probability of accepting the lot (accepting $H_0$) is $\le \beta$ whenever $p \ge p_1$ and the probability of rejecting the lot (accepting $H_1$) is $\le \alpha$ whenever $p \le p_0$.

According to formulas (3.8), (3.9), (3.10) and section 3.3 the sequential test T is given as follows: At each stage of the inspection, at the m-th observation for each integral value of m, calculate the quantity

(5.2)  $\frac{p_{1m}}{p_{0m}} = \frac{p_1^{d_m}(1 - p_1)^{m - d_m}}{p_0^{d_m}(1 - p_0)^{m - d_m}} \qquad (m = 1, 2, \ldots)$

where $d_m$ denotes the number of defectives found in the first m units inspected. Reject the lot (accept $H_1$) if

(5.3)  $\frac{p_{1m}}{p_{0m}} \ge \frac{1 - \beta}{\alpha}.$

Accept the lot if

(5.4)  $\frac{p_{1m}}{p_{0m}} \le \frac{\beta}{1 - \alpha}.$

Take an additional observation if

(5.5)  $\frac{\beta}{1 - \alpha} < \frac{p_{1m}}{p_{0m}} < \frac{1 - \beta}{\alpha}.$

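For the purpose of practical computations it is useful to rewrite the inequalities (5.3), (5.4) and (5.5) in a somewhat different form. Taking the logarithms of both sides of the inequalities (5.3), (5.4) and (5.5) one can easily verify that these inequalities are equivalent to

(5.6)  $d_m \ge \frac{\log\frac{1 - \beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} + m\,\frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}},$

(5.7)  $d_m \le \frac{\log\frac{\beta}{1 - \alpha}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} + m\,\frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}}$

and

(5.8)  $\frac{\log\frac{\beta}{1 - \alpha}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} + m\,\frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} < d_m < \frac{\log\frac{1 - \beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} + m\,\frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}},$

respectively.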

Using the inequalities (5.6), (5.7) and (5.8) the test procedure can easily be carried out as follows: For each m we compute the acceptance number

(5.9)  $A_m = \frac{\log\frac{\beta}{1 - \alpha}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} + m\,\frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}}$

and the rejection number

(5.10)  $R_m = \frac{\log\frac{1 - \beta}{\alpha}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}} + m\,\frac{\log\frac{1 - p_0}{1 - p_1}}{\log\frac{p_1}{p_0} - \log\frac{1 - p_1}{1 - p_0}}.$

There is a slight approximation involved in the formulas (5.3), (5.4) and (5.5). For details see section 3.3.



These acceptance numbers $A_m$ and rejection numbers $R_m$ are best tabulated before inspection starts. Inspection is continued as long as $A_m < d_m < R_m$. At the first time when $d_m$ does not lie between the acceptance and rejection numbers, the sampling inspection is terminated. The lot is accepted if $d_m \le A_m$ and the lot is rejected if $d_m \ge R_m$.
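The inspection rule just described lends itself to a short program. The sketch below is a minimal illustration, not a production sampling plan: it writes the acceptance and rejection numbers as two parallel lines $A_m = h_0 + sm$ and $R_m = h_1 + sm$, where the names `h0`, `h1`, `s` and both function names are introduced here for convenience and do not appear in the text.

```python
from math import log


def sprt_lines(p0, p1, alpha, beta):
    """Return (h0, h1, s) so that A_m = h0 + s*m and R_m = h1 + s*m."""
    denom = log(p1 / p0) - log((1.0 - p1) / (1.0 - p0))
    h0 = log(beta / (1.0 - alpha)) / denom   # intercept of the acceptance line
    h1 = log((1.0 - beta) / alpha) / denom   # intercept of the rejection line
    s = log((1.0 - p0) / (1.0 - p1)) / denom  # common slope
    return h0, h1, s


def inspect(units, p0, p1, alpha, beta):
    """Apply the rule to a sequence of 0/1 results (1 = defective).

    Returns (decision, m), where decision is "accept", "reject" or
    "continue" (the data ran out before a decision was reached).
    """
    h0, h1, s = sprt_lines(p0, p1, alpha, beta)
    defectives = 0
    m = 0
    for m, unit in enumerate(units, start=1):
        defectives += unit
        if defectives <= h0 + s * m:
            return "accept", m
        if defectives >= h1 + s * m:
            return "reject", m
    return "continue", m


if __name__ == "__main__":
    print(inspect([0, 0, 1, 0, 1, 1, 1], 0.1, 0.3, 0.05, 0.05))
```

Because both lines share the slope s, tabulating $A_m$ and $R_m$ in advance amounts to storing three numbers.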

The test procedure can also be carried out graphically as indicated in Figure 2. The number m of observations made is measured along the abscissa axis. Since $A_m$ is a linear function of m, the points $(m, A_m)$ will lie on a straight line $L_0$. Similarly, the points $(m, R_m)$ will lie on a straight line $L_1$. We draw the lines $L_0$ and $L_1$, and the points $(m, d_m)$ are plotted as inspection goes on. At the first time when the point $(m, d_m)$ does not lie between the lines $L_0$ and $L_1$, inspection is terminated. The lot is rejected if the point $(m, d_m)$ lies on $L_1$ or above, and the lot is accepted if the point $(m, d_m)$ lies on $L_0$ or below.

5.2.4. The operating characteristic curve of the test. As mentioned in section 5.2.3 the test procedure defined by the inequalities (5.6), (5.7) and (5.8) will satisfy the requirement that the probability of accepting the lot is $\le \beta$ whenever $p \ge p_1$ and the probability of rejecting the lot is $\le \alpha$ whenever $p \le p_0$. Although this already describes the essential features of the test procedure, it may be desirable to know the probability $L_p$ of accepting the lot for any possible value p of the proportion of defectives in the lot. Clearly, $L_p$ will be a function of p and can be plotted as shown in Figure 3. The curve $L_p$ is called the operating characteristic curve. The range of p is, of course, from 0 to 1. $L_p = 1$ for p = 0 and $L_p = 0$ for p = 1. The value of $L_p$ decreases as p increases. We already know that $L_{p_0} = 1 - \alpha$ and $L_{p_1} = \beta$. Now we shall give a method for computing the value of $L_p$ for any p. If $p_1$ is not far from $p_0$, which will usually be the case in practice, a good approximation to $L_p$ is given by (see equation 3.35)

(5.11)  $L_p \approx 1 - \frac{1 - \left(\frac{\beta}{1 - \alpha}\right)^h}{\left(\frac{1 - \beta}{\alpha}\right)^h - \left(\frac{\beta}{1 - \alpha}\right)^h}$

where h is equal to the non-zero root of the equation

(5.12)  $p\left(\frac{p_1}{p_0}\right)^h + (1 - p)\left(\frac{1 - p_1}{1 - p_0}\right)^h = 1.$

To plot the operating characteristic curve, it is not necessary to solve (5.12) with respect to h. Instead we can proceed as follows: From (5.12) we express p as a function of h, i.e.,

(5.13)  $p = \frac{1 - \left(\frac{1 - p_1}{1 - p_0}\right)^h}{\left(\frac{p_1}{p_0}\right)^h - \left(\frac{1 - p_1}{1 - p_0}\right)^h}.$

For any given value h we compute the value of p from (5.13) and the value of $L_p$ from (5.11). The point $(p, L_p)$ obtained in this way will be a point of the operating characteristic curve. Doing this for various values of h we can obtain a sufficient number of points on the operating characteristic curve so that the curve can be drawn.
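The parametric construction of the curve can be sketched as follows. The code assumes that the approximation to $L_p$ has the usual form $(A^h - 1)/(A^h - B^h)$ with $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$, and that p is obtained from h by solving $p(p_1/p_0)^h + (1-p)((1-p_1)/(1-p_0))^h = 1$ for p; the function names and the grid of h values are illustrative choices, not part of the text.

```python
def oc_point(h, p0, p1, alpha, beta):
    """Return the point (p, L_p) of the operating characteristic curve for a non-zero h."""
    ratio_def = (p1 / p0) ** h                    # (p1/p0)^h
    ratio_good = ((1.0 - p1) / (1.0 - p0)) ** h   # ((1-p1)/(1-p0))^h
    p = (1.0 - ratio_good) / (ratio_def - ratio_good)
    a_h = ((1.0 - beta) / alpha) ** h
    b_h = (beta / (1.0 - alpha)) ** h
    accept_prob = (a_h - 1.0) / (a_h - b_h)
    return p, accept_prob


def oc_curve(p0, p1, alpha, beta, h_values):
    """Points of the curve sorted by p; h = 0 is skipped because the formulas are singular there."""
    points = [oc_point(h, p0, p1, alpha, beta) for h in h_values if h != 0]
    return sorted(points)


if __name__ == "__main__":
    grid = [k / 2.0 for k in range(-8, 9)]
    for p, prob in oc_curve(0.1, 0.3, 0.05, 0.10, grid):
        print(round(p, 4), round(prob, 4))
```

The values h = 1 and h = −1 reproduce the two known points $(p_0, 1-\alpha)$ and $(p_1, \beta)$, which gives a quick check on any implementation.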



5.2.5. The average amount of inspection required by the test. Denote by $E_p(n)$ the expected value of the number of observations required by the test. Clearly, $E_p(n)$ is a function of p. According to (4.8) a good approximation to the value of $E_p(n)$ is given by

(5.14)  $E_p(n) \approx \frac{L_p\log\frac{\beta}{1 - \alpha} + (1 - L_p)\log\frac{1 - \beta}{\alpha}}{p\log\frac{p_1}{p_0} + (1 - p)\log\frac{1 - p_1}{1 - p_0}},$

where $L_p$ is given by (5.11). Plotting $E_p(n)$ as a function of p, the curve obtained will, in general, be of the type shown in Fig. 4. The maximum will ordinarily be reached between $p_0$ and $p_1$. Furthermore, the curve will, in general, be increasing as p increases from 0 to $p_0$, and decreasing as p increases from $p_1$ to 1.
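A short sketch of how the average amount of inspection could be tabulated along the same parameter h follows. It assumes the approximation $E_p(n) \approx [L_p\log B + (1 - L_p)\log A] / [p\log(p_1/p_0) + (1-p)\log((1-p_1)/(1-p_0))]$ with $A = (1-\beta)/\alpha$, $B = \beta/(1-\alpha)$, and the same parametrization of p and $L_p$ by h as in section 5.2.4; the function name is hypothetical. The value h = 0 is excluded because numerator and denominator both vanish there.

```python
from math import log


def average_sample_number(h, p0, p1, alpha, beta):
    """Return (p, E_p(n)) for a non-zero value of the parameter h."""
    ratio_def = (p1 / p0) ** h
    ratio_good = ((1.0 - p1) / (1.0 - p0)) ** h
    p = (1.0 - ratio_good) / (ratio_def - ratio_good)
    a_h = ((1.0 - beta) / alpha) ** h
    b_h = (beta / (1.0 - alpha)) ** h
    accept_prob = (a_h - 1.0) / (a_h - b_h)  # approximate L_p
    numerator = accept_prob * log(beta / (1.0 - alpha)) + (1.0 - accept_prob) * log((1.0 - beta) / alpha)
    denominator = p * log(p1 / p0) + (1.0 - p) * log((1.0 - p1) / (1.0 - p0))
    return p, numerator / denominator


if __name__ == "__main__":
    for h in (2.0, 1.0, 0.5, -0.5, -1.0, -2.0):
        p, expected_n = average_sample_number(h, 0.1, 0.3, 0.05, 0.05)
        print(round(p, 4), round(expected_n, 2))
```

For $p_0 = .1$, $p_1 = .3$ and $\alpha = \beta = .05$ the approximation gives about 23 observations at $p = p_0$ and about 17 at $p = p_1$, with larger values in between.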



5.3. Sequential analysis of double dichotomies. 5.3.1. Formulation of the problem. Suppose that we want to compare the effectiveness of two production processes, where the effectiveness of a production process is measured in terms of the proportion of effective units in the sequence produced. We shall say that a unit is effective if it has a certain desirable property, for example, if it withstands a certain strain. Let $p_1$ be the proportion of effectives if process 1 is used, and $p_2$ the proportion of effectives if process 2 is used. In other words, $p_1$ is the probability that a unit produced will be effective if process 1 is used, and $p_2$ is the probability that a unit produced will be effective if process 2 is used. Suppose that the manufacturer does not know the values of $p_1$ and $p_2$, and that process 1 is in operation. If $p_1 \ge p_2$, then the manufacturer wants to retain process 1. However, if $p_1 < p_2$, especially if $p_1$ is substantially smaller than $p_2$, the manufacturer would like to replace process 1 by process 2. Thus, we are interested in testing the hypothesis that $p_1 \ge p_2$ against the alternative that $p_1 < p_2$.

A more general formulation of the problem can be given as follows: Consider two binomial distributions. Let $p_1$ be the probability of a success in a single trial according to the first binomial distribution, and let $p_2$ be the probability of a success in a single trial according to the second binomial distribution. We shall use the symbol 1 for success and the symbol 0 for failure. Suppose that the probabilities $p_1$ and $p_2$ are unknown. We consider the problem of testing the hypothesis that $p_1 \ge p_2$ on the basis of a sample consisting of $N_1$ observations from the first binomial distribution and $N_2$ observations from the second binomial population. Since in many experiments the case $N_1 = N_2$ is mainly of interest, and since this case (as we shall see later) makes an exact and simplified mathematical treatment of the problem possible, we shall assume in what follows that $N_1 = N_2 = N$ (say).

Thus, on the basis of the outcome of the two series of N independent trials we have to decide whether the hypothesis $p_1 \ge p_2$ should be accepted or rejected.
5.3.2. The classical method. The classical solution of the problem for large N is given as follows: Let $S_1$ be the number of successes in the first set of N trials (drawn from the first binomial population), and let $S_2$ be the number of successes in the second set of N trials (drawn from the second binomial population). Denote $\frac{S_1 + S_2}{2N}$ by $\bar p$ and $1 - \bar p$ by $\bar q$. Then for large N the expression

(5.15)  $\frac{S_2 - S_1}{\sqrt{2N\bar p\,\bar q}}$

is normally distributed with zero mean and unit variance if $p_1 = p_2$. Suppose that the level of significance we wish to choose is $\alpha$. Let $\lambda_\alpha$ be the value for which the probability that a normal variate with zero mean and unit variance will exceed $\lambda_\alpha$ is equal to $\alpha$. (For example, if $\alpha = .05$, $\lambda_\alpha = 1.64$.) Thus, if $p_1 = p_2$, the probability that the expression (5.15) will exceed $\lambda_\alpha$ is equal to $\alpha$. If $p_1 > p_2$, the probability that the expression (5.15) will exceed $\lambda_\alpha$ is less than $\alpha$. According to the classical method the hypothesis that $p_1 \ge p_2$ is rejected if the observed value of (5.15) exceeds $\lambda_\alpha$. This method involves an approximation. The distribution of the expression (5.15) is not exactly normal even for large N. For small N this method cannot be used, since the distribution of (5.15) is far from normal. For small N, R. A. Fisher has proposed an exact method which, however, involves cumbersome calculations. In section 5.3.3 we shall suggest another method which is exact (does not involve any approximations) and is simple to apply as far as computations are concerned. The latter method has the further advantage of being suitable for sequential analysis, to which existing methods are not readily adaptable.

5.3.3. An exact method. Let $a_1, \ldots, a_N$ be the results in the first set of N trials, and $b_1, \ldots, b_N$ the results in the second set of N trials. These results are arranged in the order observed. Consider the sequence of N pairs

(5.16)  $(a_1, b_1), \ldots, (a_N, b_N).$

Let $t_1$ be the number of pairs (1, 0) and $t_2$ the number of pairs (0, 1) in this sequence. We consider only the pairs (0, 1) and (1, 0) and base the test on them.




166 


A. WAIjD 


Let a be the outcome of an observation from the first population, and b the
outcome of an observation from the second population. The probability that
(a, b) = (1, 0) is equal to $p_1(1 - p_2)$, and the probability that (a, b) = (0, 1) is
equal to $(1 - p_1)p_2$. Hence, knowing that (a, b) is equal to one of the pairs
(0, 1) and (1, 0), the (conditional) probability that it is equal to (0, 1) is given by

(5.17)    $p = \dfrac{(1 - p_1)\,p_2}{p_1(1 - p_2) + p_2(1 - p_1)}$

and the (conditional) probability that it is equal to (1, 0) is given by

(5.18)    $1 - p = \dfrac{p_1(1 - p_2)}{p_1(1 - p_2) + (1 - p_1)\,p_2}$.

Hence, considering only the pairs (1, 0) and (0, 1), the variate $t_2$ is distributed like
the number of successes in a sequence of $t = t_1 + t_2$ independent trials, the
probability of a success in a single trial being equal to p. One can easily verify that
$p = \frac{1}{2}$ if $p_1 = p_2$, $p < \frac{1}{2}$ if $p_1 > p_2$ and $p > \frac{1}{2}$ if $p_1 < p_2$. Thus, the hypothesis
to be tested, i.e., the hypothesis that $p_1 \ge p_2$, is equivalent to the hypothesis
that $p \le \frac{1}{2}$. Thus, we can test the hypothesis that $p_1 \ge p_2$ by testing the
hypothesis that $p \le \frac{1}{2}$ on the basis of the observed value of $t_2$. Since the
distribution of $t_2$ is the same as the distribution of the number of successes in
$t = t_1 + t_2$ independent trials (t is treated as a constant and the probability of a
success in a single trial is equal to p), the test procedure can be carried out in the
usual manner. If we want a level of significance $\alpha$, a critical value T is chosen so that
for $p = \frac{1}{2}$ the probability that $t_2 \ge T$ is equal to $\alpha$. The hypothesis that $p \le \frac{1}{2}$
is rejected if and only if the observed $t_2$ is greater than or equal to the critical
value T. The value of T can be obtained from a table of the binomial distribution.
If t is large, $t_2$ is nearly normally distributed and the critical value T can
be obtained from a table of the normal distribution.
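
The exact procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the binomial table is replaced by a direct computation, and, since the binomial distribution is discrete, the critical value is taken as the smallest T whose tail probability does not exceed the chosen level (the text asks for equality, which is generally not attainable exactly).

```python
from math import comb

def discordant_counts(a, b):
    """Return (t1, t2): the numbers of pairs (1, 0) and (0, 1)."""
    t1 = sum(1 for x, y in zip(a, b) if (x, y) == (1, 0))
    t2 = sum(1 for x, y in zip(a, b) if (x, y) == (0, 1))
    return t1, t2

def critical_value(t, alpha):
    """Smallest T with P(t2 >= T | p = 1/2) <= alpha (assumption: <= instead of =).

    Returns t + 1 when no value of t2 can lead to rejection.
    """
    T, tail = t + 1, 0.0
    for k in range(t, -1, -1):      # accumulate the upper tail from k = t downward
        tail += comb(t, k) / 2 ** t
        if tail > alpha:
            break
        T = k
    return T

def exact_test(a, b, alpha):
    """True if the hypothesis p <= 1/2 (i.e. p1 >= p2) is rejected."""
    t1, t2 = discordant_counts(a, b)
    return t2 >= critical_value(t1 + t2, alpha)
```

For instance, with t = 10 discordant pairs and a level of 0.05 the sketch gives T = 9, because the tail probability of at least 9 successes is 11/1024 while that of at least 8 is 56/1024.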

This procedure thus provides a simple test of the hypothesis that $p_1 \ge p_2$.
The question arises whether the efficiency of this method is as high as that of the
classical method. It would seem that the method suggested here cannot be a
most efficient procedure, since the values of $t_1$ and $t_2$ depend on the order of the
elements in the sequences $(a_1, \ldots, a_N)$ and $(b_1, \ldots, b_N)$, and there is no
particular reason to arrange them in the order observed. However, it has been
shown in [7] that the loss in efficiency as compared with the classical method is
negligible if the number N of trials is large.*

It should be pointed out that the procedure for testing the hypothesis that
$p_1 \ge p_2$ can be used also for testing the hypothesis that $p_1 = p_2$ if the alternative
hypotheses are restricted to $p_2 > p_1$.

* The author believes that the loss in efficiency is slight even when N is small, although
no exact investigation of this case has been made.

In addition to simplicity and exactness the present method seems superior to
the classical one in the following respect: Suppose that (contrary to the original
assumption) the probability of a success varies from trial to trial. Denote by
$p_1^{(i)}$ the probability of success in the i-th trial of the first set, and by $p_2^{(i)}$ the
probability of success in the i-th trial in the second set ($i = 1, \ldots, N$). Assume
that the probabilities $p_1^{(i)}$ and $p_2^{(i)}$ are entirely unknown and we wish to test the
hypothesis that $p_1^{(1)} - p_2^{(1)} = \cdots = p_1^{(N)} - p_2^{(N)} = 0$. In this case the classical
method is not applicable, but the present method provides a correct procedure.
Such a situation may arise, for instance, if we want to test the hypothesis that
the probability of a success (hitting the target) is the same for two different guns.
In the course of the experiments the probability of a hit may change due to
external conditions such as wind, disposition of the gunner, etc. However, these
external conditions are likely to affect both guns equally if the trials are made
alternately (or approximately alternately), so that if the two guns are equally
good we have $p_1^{(i)} = p_2^{(i)}$ ($i = 1, \ldots, N$).

5.3.4. Sequential test of the hypothesis that $p_1 \ge p_2$. In order to devise a proper
sequential test for testing the hypothesis that $p_1 \ge p_2$, we have to state first
what risks of making wrong decisions we are willing to tolerate. The efficiency
of the production process 1 may be measured by the ratio of effectives to
ineffectives produced, i.e., by $k_1 = \dfrac{p_1}{1 - p_1}$. Production process 1 may be regarded
the more efficient the larger the value of $k_1$. Similarly, the efficiency of
production process 2 may be measured by $k_2 = \dfrac{p_2}{1 - p_2}$. The relative superiority of
production process 2 over the process 1 can then reasonably be measured by the
ratio of $k_2$ to $k_1$, i.e., by

(5.19)    $u = \dfrac{k_2}{k_1} = \dfrac{p_2(1 - p_1)}{p_1(1 - p_2)}$.

If u = 1, the two processes are equally good. If u > 1, process 2 is superior to
process 1, and if u < 1, process 1 is superior to process 2. Thus, the manufacturer
will, in general, be able to select two values of u, $u_0$ and $u_1$ say ($u_0 < u_1$),
such that the rejection of process 1 in favor of process 2 is considered an error of
practical importance whenever the true value of $u \le u_0$, and the maintenance
of process 1 is considered an error of practical importance whenever $u \ge u_1$.
If u lies between $u_0$ and $u_1$, the manufacturer does not care particularly which
decision is taken.

Clearly, we will always have $u_0 < u_1$. If the transition from production
process 1 to process 2 involves some cost or other inconveniences, it seems
reasonable to put $u_0 = 1$ (or $u_0$ may even be slightly greater than one). This
choice of $u_0$ really means that we consider the rejection of process 1 a serious error
whenever this process is not inferior to process 2. On the other hand, if the
transition from process 1 to process 2 does not involve any inconveniences, the
rejection of process 1 in favor of 2 cannot be a serious error when the two processes
are equally efficient, i.e., when u = 1. Thus, in such a case, it seems reasonable
to choose $u_0$ somewhat below 1.





After the quantities $u_0$ and $u_1$ have been chosen the risks that we are willing
to tolerate may reasonably be expressed in the following form: The probability
of rejecting process 1 should not exceed a preassigned value $\alpha$ whenever $u \le u_0$,
and the probability of maintaining process 1 should not exceed a preassigned
value $\beta$ whenever $u \ge u_1$.

Thus, the risks that we are willing to tolerate are characterized by the four
quantities $u_0$, $u_1$, $\alpha$ and $\beta$. After these four quantities have been chosen, a
proper sequential test can be carried out as follows: The (conditional) probability
that we obtain a pair (0, 1), as given in (5.17), can be expressed as a function
of u. In fact

(5.20)    $p = \dfrac{(1 - p_1)p_2}{p_1(1 - p_2) + p_2(1 - p_1)} = \dfrac{\dfrac{(1 - p_1)p_2}{p_1(1 - p_2)}}{1 + \dfrac{(1 - p_1)p_2}{p_1(1 - p_2)}} = \dfrac{u}{1 + u}$.

Let $H_0$ denote the hypothesis that $p = \dfrac{u_0}{1 + u_0}$, and $H_1$ the hypothesis that
$p = \dfrac{u_1}{1 + u_1}$. A proper sequential test satisfying our requirements concerning
tolerated risks is the sequential probability ratio test of $H_0$ against $H_1$. The
acceptance and rejection numbers for this sequential test can be obtained from
(5.9) and (5.10) by substituting $\dfrac{u_0}{1 + u_0}$ for $p_0$, $\dfrac{u_1}{1 + u_1}$ for $p_1$ and $t = t_1 + t_2$ for m.

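Thus, for each value of t the acceptance number is given by

(5.21)    $A_t = \dfrac{\log \dfrac{\beta}{1 - \alpha}}{\log u_1 - \log u_0} + t\,\dfrac{\log \dfrac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0}$

and the rejection number is given by

(5.22)    $R_t = \dfrac{\log \dfrac{1 - \beta}{\alpha}}{\log u_1 - \log u_0} + t\,\dfrac{\log \dfrac{1 + u_1}{1 + u_0}}{\log u_1 - \log u_0}$.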


These acceptance numbers $A_t$ and rejection numbers $R_t$ ($t = 1, 2, \ldots$) are best
tabulated before experimentation starts. The sequential test is then carried out
as follows: The observations are taken in pairs where each pair consists of an
observation from the first process and an observation from the second process.
We continue taking pairs as long as $A_t < t_2 < R_t$. At the first time when $t_2$
does not lie between the acceptance and rejection numbers, experimentation is
terminated. Process 1 is maintained if at this final stage $t_2 \le A_t$, and process 1
is rejected in favor of 2 if $t_2 \ge R_t$.
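
As an illustration, the sequential procedure can be sketched in Python as follows. The function names and the returned strings are hypothetical and chosen only for this sketch; the boundaries are those of (5.21) and (5.22), with natural logarithms (the base cancels in the ratios).

```python
from math import log

def boundaries(t, u0, u1, alpha, beta):
    """Acceptance number A_t and rejection number R_t of (5.21) and (5.22)."""
    denom = log(u1) - log(u0)
    slope = log((1 + u1) / (1 + u0)) / denom
    a_t = log(beta / (1 - alpha)) / denom + t * slope
    r_t = log((1 - beta) / alpha) / denom + t * slope
    return a_t, r_t

def sequential_pairs_test(pairs, u0, u1, alpha, beta):
    """Process pairs (a, b) in order; return (decision, number of pairs used)."""
    n = t = t2 = 0
    for a, b in pairs:
        n += 1
        if a == b:
            continue                 # pairs (0, 0) and (1, 1) are disregarded
        t += 1                       # t = t1 + t2, number of discordant pairs
        if (a, b) == (0, 1):
            t2 += 1
        a_t, r_t = boundaries(t, u0, u1, alpha, beta)
        if t2 <= a_t:
            return "maintain process 1", n
        if t2 >= r_t:
            return "reject process 1", n
    return "no decision", n
```

With $u_0 = 1$, $u_1 = 2$ and both risks equal to 0.05, an unbroken run of pairs (0, 1) leads to rejection of process 1 at the 11th pair, and an unbroken run of pairs (1, 0) leads to maintaining process 1 at the 8th pair.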

The test procedure can also be carried out graphically as shown in Figure 5.
The total number t of pairs (0, 1) and (1, 0) is measured along the horizontal
axis. The points $(t, A_t)$ will lie on a straight line $L_0$, since $A_t$ is a linear function
of t. The points $(t, R_t)$ will lie on a parallel line $L_1$. We draw the lines $L_0$ and
$L_1$ and plot the points $(t, t_2)$ as experimentation goes on. At the first time when
the point $(t, t_2)$ is not within the lines $L_0$ and $L_1$ experimentation is terminated.
Process 1 is maintained if at the final stage the point $(t, t_2)$ lies on $L_0$ or below,
and process 1 is rejected if the point $(t, t_2)$ lies on $L_1$ or above.

5.3.5. The operating characteristic curve of the test. For any value u of the ratio
$\dfrac{k_2}{k_1}$ we shall denote by $L_u$ the probability of maintaining process 1. Clearly, $L_u$
is a function of u. This function $L_u$ is called the operating characteristic curve
of the test. The operating characteristic curve can be determined from the
equations (5.11) and (5.13) by substituting $\dfrac{u_1}{1 + u_1}$ for $p_1$ and $\dfrac{u_0}{1 + u_0}$ for $p_0$.
For any given value h we compute the values of u and $L_u$ from the resulting equations.
The point $(u, L_u)$ obtained in this way will be a point of the operating
characteristic curve. Calculating the points $(u, L_u)$ for a sufficiently large number of
values of h we can draw the operating characteristic curve.
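
The parametric computation can be sketched as below. Equations (5.11) and (5.13) are not reproduced in this section, so the sketch rests on an assumption: that they have the usual form for the binomial sequential probability ratio test, namely $p = \dfrac{1 - \left(\frac{1-p_1}{1-p_0}\right)^h}{\left(\frac{p_1}{p_0}\right)^h - \left(\frac{1-p_1}{1-p_0}\right)^h}$ and $L = \dfrac{A^h - 1}{A^h - B^h}$ with $A = \dfrac{1-\beta}{\alpha}$, $B = \dfrac{\beta}{1-\alpha}$. The function name is hypothetical.

```python
def oc_point(h, u0, u1, alpha, beta):
    """Return one point (u, L_u) of the curve for a nonzero auxiliary value h.

    Assumes the standard forms of (5.11) and (5.13); see the lead-in.
    """
    p0, p1 = u0 / (1 + u0), u1 / (1 + u1)
    A, B = (1 - beta) / alpha, beta / (1 - alpha)
    q = ((1 - p1) / (1 - p0)) ** h
    p = (1 - q) / ((p1 / p0) ** h - q)   # conditional probability of a pair (0, 1)
    L = (A ** h - 1) / (A ** h - B ** h)
    return p / (1 - p), L                # u = p / (1 - p), since p = u / (1 + u)
```

Under this assumption h = 1 reproduces the point $(u_0, 1 - \alpha)$ and h = -1 the point $(u_1, \beta)$, which agrees with the risks the test was designed for.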

5.3.6. The average amount of inspection required by the test. For any value u
of the ratio $\dfrac{k_2}{k_1}$ denote by $E_u(t)$ the expected value of the total number of pairs
(0, 1) and (1, 0) required by the test. The value of $E_u(t)$ can be obtained from
(5.14) by substituting $E_u(t)$ for $E_p(n)$, $L_u$ for $L_p$, $\dfrac{u_1}{1 + u_1}$ for $p_1$ and $\dfrac{u_0}{1 + u_0}$ for
$p_0$. Thus

(5.25)    $E_u(t) = \dfrac{L_u \log \dfrac{\beta}{1 - \alpha} + (1 - L_u) \log \dfrac{1 - \beta}{\alpha}}{\dfrac{u}{1 + u} \log \dfrac{u_1(1 + u_0)}{u_0(1 + u_1)} + \dfrac{1}{1 + u} \log \dfrac{1 + u_0}{1 + u_1}}$.

To compute the expected value of the total number of pairs (including also
the pairs (0, 0) and (1, 1)), we merely have to divide the right side expression in
(5.25) by $p_1(1 - p_2) + p_2(1 - p_1)$.

In the rare event that no decision is yet reached at a number of pairs equal to
three times the expected value, we can truncate the test at that stage without
seriously affecting the probabilities of making a wrong decision (see section 4.6
in Part I).
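
A small numerical sketch of (5.25) follows; the function names are hypothetical, and the value of $L_u$ is supplied by the caller (for example, read from the operating characteristic curve).

```python
from math import log

def expected_discordant_pairs(u, L_u, u0, u1, alpha, beta):
    """Expected number of pairs (0, 1) and (1, 0) according to (5.25)."""
    num = L_u * log(beta / (1 - alpha)) + (1 - L_u) * log((1 - beta) / alpha)
    den = (u / (1 + u)) * log(u1 * (1 + u0) / (u0 * (1 + u1))) \
        + (1 / (1 + u)) * log((1 + u0) / (1 + u1))
    return num / den

def expected_total_pairs(e_t, p1, p2):
    """Expected number of all pairs, including (0, 0) and (1, 1)."""
    return e_t / (p1 * (1 - p2) + p2 * (1 - p1))
```

For $u = u_0 = 1$, $u_1 = 2$, both risks 0.05 and $L_u = 0.95$, the sketch gives about 45 discordant pairs; if in addition $p_1 = p_2 = \frac{1}{2}$, half of all pairs are discordant and the expected total number of pairs is about 90.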

5.3.7. Observations made in groups of r. In applications it may happen that at
each stage in the sequential process instead of drawing a single observation we
draw r observations from each of the binomial distributions. Hence, instead of
a single pair, we have two sets of r observations. If the order of observations
in each such set of r is recorded, we can establish the number of pairs (0, 1) and
the number of pairs (1, 0) for each pair of sets of r observations. In such a case
the test can be carried out as described in section 5.3.4, since after each pair of
sets of r observations we can compute t and $t_2$. The only effect of taking the
observations in groups of r is that more observations will generally be necessary
(approximately enough to fill out a group) and thereby the probability of making
an incorrect decision will be made somewhat smaller. However, if the order of
observations in such groups of r is not recorded, the difficulty arises that we are
not able to determine the values of t and $t_2$ needed for the test procedure. It has
been shown in [7] that in such a case we may replace t and $t_2$ by certain estimates
of t and $t_2$ without affecting seriously the probability of making an incorrect
decision. The estimates of $t_1$ and $t_2$ (and thereby also an estimate of $t = t_1 + t_2$)
are obtained as follows: Let $r_1$ be the number of successes in the group of r
observations drawn from the first binomial distribution, and let $r_2$ be the number
of successes in the group of r observations drawn from the second binomial
distribution. Then for this pair of groups of r observations, we estimate the number
of pairs (1, 0) to be $r_1 - \dfrac{r_1 r_2}{r}$ and the number of pairs (0, 1) to be $r_2 - \dfrac{r_1 r_2}{r}$. Thus,
an estimate of $t_1$ is obtained by summing $r_1 - \dfrac{r_1 r_2}{r}$ over all pairs of groups
observed, and that of $t_2$ is obtained by summing $r_2 - \dfrac{r_1 r_2}{r}$ over all pairs of groups
observed.
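
The estimates for unrecorded order amount to a short accumulation, sketched below with a hypothetical function name. Each element of `groups` is assumed to be a pair (r1, r2) of success counts from one pair of groups of size r.

```python
def estimate_counts(groups, r):
    """Estimate (t1, t2) from success counts (r1, r2) of pairs of groups of size r."""
    t1 = t2 = 0.0
    for r1, r2 in groups:
        both = r1 * r2 / r      # estimated number of pairs (1, 1) in this pair of groups
        t1 += r1 - both         # estimated number of pairs (1, 0)
        t2 += r2 - both         # estimated number of pairs (0, 1)
    return t1, t2
```

The estimated t is then the sum of the two returned values, and the test of section 5.3.4 is applied with these estimates in place of the exact counts.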

5.4. Application to testing the mean of a normal distribution with known standard
deviation.

5.4.1. Formulation of the problem. Suppose that a measurable
quantity x is normally distributed with unknown mean $\theta$ and known standard
deviation $\sigma$. For example, x may be some measurable quality characteristic
of a unit of a certain product where x is normally distributed with a known
standard deviation in the population of all units. The problem we shall consider
here is to test the hypothesis that the unknown mean $\theta$ is less than a specified
value $\theta'$. This problem arises frequently, for example, in quality control.
Suppose that the quality of the product is considered the better the higher the
mean value of x. Thus, there will be a value $\theta'$ such that the product is
considered sub-standard if $\theta < \theta'$ and the product is considered to meet specifications
if $\theta \ge \theta'$. Since $\theta$ is unknown, we are usually interested in testing the hypothesis
that $\theta < \theta'$, i.e., that the product is sub-standard.

Since quality control is an important field of application for such test
procedures, the discussion will be continued in the terminology of quality control.
This, of course, should not be interpreted as a restriction upon the general
validity and applicability of the test procedure. The problem treated in section
5.4 can now be stated as follows: Let x be a measurable quality characteristic
of a unit of a certain product. The variable x is supposed to be normally
distributed with known standard deviation in the population of all units
produced. The problem is to devise a sampling plan for testing the hypothesis
that the product is sub-standard. The product is said to be sub-standard if
the mean $\theta$ of x is less than a given specified value $\theta'$.

5.4.2. Tolerated risks for making a wrong decision. No sampling plan can
guarantee that the correct decision will always be made, i.e., that the product
will be declared sub-standard if and only if $\theta < \theta'$. The larger the amount of
inspection, the smaller we can make the risks for making a wrong decision. If
inspection is costly, or destructive, we are willing to tolerate some risks of making
wrong decisions in order to reduce the necessary amount of inspection. Thus,
a proper sampling plan can be recommended only after the risks that can be
tolerated have been stated.

If the quality of the product is exactly on the margin, i.e., if $\theta = \theta'$, then it
will make little difference whether the product is classified as sub-standard or
not. However, if $\theta$ is considerably smaller than $\theta'$, then the acceptance of the
hypothesis that the product meets specifications (rejection of the hypothesis
that the product is sub-standard) will usually be considered as a serious error.
Similarly, if $\theta$ is much larger than $\theta'$, the acceptance of the hypothesis that the
product is sub-standard will generally be considered as a serious error. Thus,
the manufacturer will, in general, be able to select two values of $\theta$, $\theta_0$ and $\theta_1$ say
($\theta_0 < \theta'$ and $\theta_1 > \theta'$) such that the classification of the product as satisfactory
(meeting specifications) is considered an error of practical importance whenever
$\theta \le \theta_0$, and the classification of the product as sub-standard is considered an
error of practical importance whenever $\theta \ge \theta_1$. If $\theta$ lies between $\theta_0$ and $\theta_1$, a
wrong classification of the product will not be viewed as a serious error, since
in this case $\theta$ is near the marginal value $\theta'$.

After the two values $\theta_0$ and $\theta_1$ have been selected, the risks that we are willing
to tolerate can be stated in the following form: A sampling plan is required
such that the probability of classifying the product as satisfactory is less than
or equal to a preassigned quantity $\alpha$ whenever $\theta \le \theta_0$, and such that the
probability of classifying the product as sub-standard is less than or equal to a
preassigned quantity $\beta$ whenever $\theta \ge \theta_1$. Thus, the tolerated risks are
characterized by the four quantities $\theta_0$, $\theta_1$, $\alpha$ and $\beta$. A proper sampling plan can
be devised after these four quantities have been selected.

5.4.3. A sequential test of the hypothesis that $\theta < \theta'$ (the product is sub-standard).
Let $H_0$ be the hypothesis that $\theta = \theta_0$ and let $H_1$ be the hypothesis that $\theta = \theta_1$.
Let T be the sequential probability ratio test for testing $H_0$ against $H_1$ such that
$\alpha$ is the probability of accepting $H_1$ when $H_0$ is true and $\beta$ is the probability of
accepting $H_0$ when $H_1$ is true. This sequential test will satisfy all our requirements,
since for this test the probability of accepting $H_0$ (declaring the product
as sub-standard) is $\le \beta$ whenever $\theta \ge \theta_1$, and the probability of accepting $H_1$
(declaring the product as satisfactory) is $\le \alpha$ whenever $\theta \le \theta_0$.

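The sequential test T is given as follows: Denote the successive observations
on x by $x_1, x_2, \ldots$, etc. Accept the hypothesis that the product is satisfactory
at the m-th observation if

(5.26)    $\log \dfrac{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_1)^2\right]}{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_0)^2\right]} \ge \log \dfrac{1 - \beta}{\alpha}$.

Accept the hypothesis that the product is sub-standard if

(5.27)    $\log \dfrac{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_1)^2\right]}{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_0)^2\right]} \le \log \dfrac{\beta}{1 - \alpha}$.

Take an additional observation if

(5.28)    $\log \dfrac{\beta}{1 - \alpha} < \log \dfrac{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_1)^2\right]}{\exp\left[-\frac{1}{2\sigma^2}\sum_{a=1}^{m}(x_a - \theta_0)^2\right]} < \log \dfrac{1 - \beta}{\alpha}$.

The inequalities (5.26), (5.27) and (5.28) are equivalent to

(5.29)    $\sum_{a=1}^{m} x_a \ge \dfrac{\sigma^2}{\theta_1 - \theta_0} \log \dfrac{1 - \beta}{\alpha} + m\,\dfrac{\theta_0 + \theta_1}{2}$,

(5.30)    $\sum_{a=1}^{m} x_a \le \dfrac{\sigma^2}{\theta_1 - \theta_0} \log \dfrac{\beta}{1 - \alpha} + m\,\dfrac{\theta_0 + \theta_1}{2}$,

(5.31)    $\dfrac{\sigma^2}{\theta_1 - \theta_0} \log \dfrac{\beta}{1 - \alpha} + m\,\dfrac{\theta_0 + \theta_1}{2} < \sum_{a=1}^{m} x_a < \dfrac{\sigma^2}{\theta_1 - \theta_0} \log \dfrac{1 - \beta}{\alpha} + m\,\dfrac{\theta_0 + \theta_1}{2}$,

respectively.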

Using the inequalities (5.29), (5.30) and (5.31) the test procedure can easily
be carried out as follows: For each m compute the acceptance number

(5.32)    $A_m = \dfrac{\sigma^2}{\theta_1 - \theta_0} \log \dfrac{\beta}{1 - \alpha} + m\,\dfrac{\theta_0 + \theta_1}{2}$

and the rejection number

(5.33)    $R_m = \dfrac{\sigma^2}{\theta_1 - \theta_0} \log \dfrac{1 - \beta}{\alpha} + m\,\dfrac{\theta_0 + \theta_1}{2}$.

These acceptance numbers $A_m$ and rejection numbers $R_m$ are best tabulated
before inspection starts. Inspection is continued as long as
$A_m < \sum_{a=1}^{m} x_a < R_m$. At the first time that $\sum_{a=1}^{m} x_a$ does not lie between $A_m$ and
$R_m$, inspection is terminated. If at this final stage $\sum_{a=1}^{m} x_a \le A_m$, the hypothesis
that the product is sub-standard is accepted, and if $\sum_{a=1}^{m} x_a \ge R_m$, the hypothesis
that the product is sub-standard is rejected.

The test procedure can also be carried out graphically as shown in Figure 6.
The number m of observations is measured along the horizontal axis. The
points $(m, A_m)$ will lie on a straight line $L_0$ and the points $(m, R_m)$ will lie on a
parallel line $L_1$. We draw the parallel lines $L_0$ and $L_1$ and plot the points
$\left(m, \sum_{a=1}^{m} x_a\right)$ as inspection goes on. At the first time when the point
$\left(m, \sum_{a=1}^{m} x_a\right)$ does not lie between the lines $L_0$ and $L_1$ inspection is terminated.
The hypothesis that the product is sub-standard is rejected if the point
$\left(m, \sum_{a=1}^{m} x_a\right)$ lies on $L_1$ or above. The hypothesis in question is accepted if
the point $\left(m, \sum_{a=1}^{m} x_a\right)$ lies on $L_0$ or below.
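
The inspection rule based on the acceptance numbers (5.32) and rejection numbers (5.33) can be sketched as follows. The function name and the returned labels are hypothetical; the observations are assumed to arrive one at a time in the iterable `xs`.

```python
from math import log

def normal_mean_sequential_test(xs, theta0, theta1, sigma, alpha, beta):
    """Return (decision, m) using A_m of (5.32) and R_m of (5.33)."""
    c = sigma ** 2 / (theta1 - theta0)
    lower = c * log(beta / (1 - alpha))      # intercept of the line L_0
    upper = c * log((1 - beta) / alpha)      # intercept of the line L_1
    slope = (theta0 + theta1) / 2            # common slope of L_0 and L_1
    total, m = 0.0, 0
    for x in xs:
        m += 1
        total += x
        if total <= lower + m * slope:       # cumulative sum <= A_m
            return "sub-standard", m
        if total >= upper + m * slope:       # cumulative sum >= R_m
            return "satisfactory", m
    return "no decision", m
```

With $\theta_0 = 0$, $\theta_1 = 1$, $\sigma = 1$ and both risks 0.05, a constant stream of observations equal to 1.5 is declared satisfactory at the third observation, while observations equal to 0.5 (exactly on the common slope) never leave the region between the two lines.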





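5.4.4. The operating characteristic curve of the test. For any value $\theta$ denote by
$L_\theta$ the probability that the hypothesis that the product is sub-standard is
accepted. Obviously, $L_\theta$ will be a function of $\theta$ and is called the operating
characteristic curve of the test. The shape of the operating characteristic curve
will, in general, be of the type shown in Figure 7. $L_\theta$ approaches 1 as $\theta \to -\infty$
and $L_\theta$ approaches zero as $\theta \to \infty$. Furthermore, $L_\theta$ is a decreasing function
of $\theta$. We already know the values of $L_\theta$ for $\theta = \theta_0$ and $\theta = \theta_1$. Now we shall
give a method for computing the value of $L_\theta$ for any $\theta$. If $\dfrac{\theta_1 - \theta_0}{\sigma}$ is fairly small,
which will usually be the case in practice, a good approximation to $L_\theta$ is given
by (see equation 3.35)

(5.34)    $L_\theta \sim \dfrac{\left(\dfrac{1 - \beta}{\alpha}\right)^h - 1}{\left(\dfrac{1 - \beta}{\alpha}\right)^h - \left(\dfrac{\beta}{1 - \alpha}\right)^h}$

where the constant h is determined as follows: First we compute the characteristic
function $\varphi(t)$ of the variate

(5.35)    $z = \log \dfrac{\exp\left[-\frac{1}{2\sigma^2}(x - \theta_1)^2\right]}{\exp\left[-\frac{1}{2\sigma^2}(x - \theta_0)^2\right]} = \dfrac{1}{2\sigma^2}\left[2(\theta_1 - \theta_0)x + \theta_0^2 - \theta_1^2\right]$.

Thus, z is normally distributed with mean $= \dfrac{1}{2\sigma^2}\left[2(\theta_1 - \theta_0)\theta + \theta_0^2 - \theta_1^2\right]$
and variance $= \dfrac{(\theta_1 - \theta_0)^2}{\sigma^2}$. Consequently, $\varphi(t)$ is given by

(5.36)    $\varphi(t) = E\left(e^{zt}\right) = \exp\left\{\dfrac{2(\theta_1 - \theta_0)\theta + \theta_0^2 - \theta_1^2}{2\sigma^2}\,t + \dfrac{(\theta_1 - \theta_0)^2}{2\sigma^2}\,t^2\right\}$.

The value h is the non-zero real root of the equation $\varphi(h) = 1$. Hence

(5.37)    $h = \dfrac{\theta_1^2 - \theta_0^2 - 2(\theta_1 - \theta_0)\theta}{(\theta_1 - \theta_0)^2} = \dfrac{\theta_1 + \theta_0 - 2\theta}{\theta_1 - \theta_0}$.

The operating characteristic curve can be computed from (5.34) substituting
the right hand side member of (5.37) for h.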
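
A sketch of this computation is given below (hypothetical function name). One point is an addition not found in the text: at the midpoint $\theta = (\theta_0 + \theta_1)/2$ the root h is zero and the expression in (5.34) takes the form 0/0, so the sketch uses the limiting value $\log A / (\log A - \log B)$ with $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$.

```python
from math import log

def oc_normal(theta, theta0, theta1, alpha, beta):
    """Approximate L_theta from (5.34), with h taken from (5.37)."""
    h = (theta1 + theta0 - 2 * theta) / (theta1 - theta0)
    A, B = (1 - beta) / alpha, beta / (1 - alpha)
    if h == 0:
        # Limit of (5.34) as h -> 0 (an assumption added for this sketch).
        return log(A) / (log(A) - log(B))
    return (A ** h - 1) / (A ** h - B ** h)
```

At $\theta = \theta_0$ the root is h = 1 and the sketch returns $1 - \alpha$; at $\theta = \theta_1$ the root is h = -1 and it returns $\beta$, in agreement with the known values of $L_\theta$ at these two points.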

5.4.5. The average amount of inspection required by the test. Let $E_\theta(n)$ denote
the expected value of the number of observations required by the test when $\theta$
is the true mean of x. According to (4.8) a good approximation to the value of
$E_\theta(n)$ is given by

$E_\theta(n) \sim 2\sigma^2\,\dfrac{L_\theta \log \dfrac{\beta}{1 - \alpha} + (1 - L_\theta) \log \dfrac{1 - \beta}{\alpha}}{\theta_0^2 - \theta_1^2 + 2(\theta_1 - \theta_0)\theta}$

where $L_\theta$ is given by (5.34).

In the rare event that the number of observations reaches three times the
expected value before the test is terminated, we can truncate the test at this
stage without seriously affecting the probabilities of making a wrong decision.
(See section 4.6 in Part I).
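
The approximation for the expected number of observations can be evaluated as in the sketch below. It assumes the general form "expected log-ratio at termination divided by the mean of z", with the mean of z as stated for (5.35); the function name is hypothetical and the value of $L_\theta$ is supplied by the caller. The formula is not usable at $\theta = (\theta_0 + \theta_1)/2$, where the mean of z vanishes.

```python
from math import log

def expected_sample_size(theta, L_theta, theta0, theta1, sigma, alpha, beta):
    """Approximate E_theta(n); undefined when theta is the midpoint of theta0, theta1."""
    num = L_theta * log(beta / (1 - alpha)) + (1 - L_theta) * log((1 - beta) / alpha)
    mean_z = (theta0 ** 2 - theta1 ** 2 + 2 * (theta1 - theta0) * theta) / (2 * sigma ** 2)
    return num / mean_z
```

For $\theta = \theta_0 = 0$, $\theta_1 = 1$, $\sigma = 1$, both risks 0.05 and $L_\theta = 0.95$, the sketch gives about 5.3 observations.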


6. Outline of a General Theory of Sequential Tests of Hypotheses when No
Restrictions Are Imposed on the Alternative Values of the Unknown Parameters

6.1. Sequential test of a simple hypothesis with no restrictions on the alternative
values of the unknown parameters. Consider the following general case. Let
$X_1, \ldots, X_p$ be a set of p random variables and let $f(x_1, \ldots, x_p, \theta_1, \ldots, \theta_k)$
be the joint probability density function of these random variables involving k
unknown parameters $\theta_1, \ldots, \theta_k$. Suppose that we wish to test the hypothesis
$H_0$ that $\theta_1 = \theta_1^0, \ldots, \theta_k = \theta_k^0$, where $\theta_1^0, \ldots, \theta_k^0$ are some given specified values.
Denote the set of all a priori possible parameter points by $\Omega$. Assume that $\Omega$
contains at least a finite k-dimensional sphere with the center $(\theta_1^0, \ldots, \theta_k^0)$.
Let $\Omega^*$ be the set of all possible alternative parameter points; i.e., $\Omega^*$ is the
whole parameter space $\Omega$ with the exception of the point $\theta^0 = (\theta_1^0, \ldots, \theta_k^0)$.
For any statistical procedure for testing $H_0$, the probability of an error of the
first kind will have a definite value, but the probability of an error of the second
kind will depend on the true alternative; i.e., it will be a single valued function
$\beta(\theta)$ defined over all points $\theta$ of $\Omega^*$. Let $w(\theta)$ be some non-negative function,
called weight function, such that $\int_{\Omega^*} w(\theta)\,d\theta = 1$. Suppose that we wish to
construct a sequential test such that the probability of an error of the first kind
is equal to a given $\alpha$ and that the weighted average $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta$ of the
probabilities of errors of the second kind is equal to some given positive value $\beta$.
This problem can easily be solved as follows: Let $p_{0n}$ be equal to the product
$\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1^0, \ldots, \theta_k^0)$ where $x_{ia}$ denotes the a-th observation on
$x_i$ ($i = 1, \ldots, p$; $a = 1, \ldots, n$). Furthermore, let $p_{1n}$ be defined by

(6.1)    $p_{1n} = \int_{\Omega^*} w(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\right] d\theta$.

The expression $p_{1n}$ can be interpreted as the probability density in the sample
space of n observations on the variates $X_1, \ldots, X_p$, if we assume that the
parameter point $\theta$ in $\Omega^*$ has a probability distribution given by the density
function $w(\theta)\,d\theta$.

We shall denote by $H_1$ the hypothesis that the probability density function
in the sample space of n observations on $X_1, \ldots, X_p$ is given by $p_{1n}$ defined in
equation (6.1). The problem of testing $H_0$ against the single alternative $H_1$
is not exactly of the type discussed in Part I, since $p_{1n}$ given in (6.1) cannot be
represented, in general, as a product of n factors where the a-th factor depends
only on the observations $x_{1a}, \ldots, x_{pa}$. However, it was pointed out in section
3.2 that the fundamental inequalities derived in Section 3.2 remain valid
also when $p_{1n}$ is given by an expression of the type (6.1). Thus, we can use the
sequential probability ratio test for testing $H_0$ against the single alternative $H_1$.
We reject $H_0$ if

(6.2)    $\dfrac{p_{1n}}{p_{0n}} \ge A$,

we accept $H_0$ if

(6.3)    $\dfrac{p_{1n}}{p_{0n}} \le B$,

and we make an additional observation if

(6.4)    $B < \dfrac{p_{1n}}{p_{0n}} < A$.

The expression $p_{1n}$ is given by (6.1) and the constants A and B are chosen so
that the probability of accepting $H_1$ when $H_0$ is true is $\alpha$, and the probability
of accepting $H_0$ when $H_1$ is true is $\beta$. Thus, for practical purposes we may put
$A = \dfrac{1 - \beta}{\alpha}$ and $B = \dfrac{\beta}{1 - \alpha}$.

Using the sequential process defined by the inequalities (6.2), (6.3), and (6.4)
we obviously have

(6.5)    $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta = \beta$

where for each point $\theta$ in $\Omega^*$, $\beta(\theta)$ denotes the probability of accepting $H_0$ under
the assumption that $\theta$ is the true parameter point.

Thus, the sequential test given by (6.2), (6.3), and (6.4) provides a satisfactory
solution of the problem if we want a test procedure such that the probability of
an error of the first kind is $\alpha$ and the weighted average $\int_{\Omega^*} w(\theta)\beta(\theta)\,d\theta$ of the
probabilities of errors of the second kind is $\beta$. Practical problems, however,
do not always take this form. Many instances require a test procedure such
that $\beta(\theta)$ should be less than or equal to a given positive value $\beta$ for all parameter
points $\theta$ whose "distance" (defined in some sense) from $\theta^0$ is greater than or
equal to some given positive value $d_0$. The "distance" of two parameter points
$\theta^1$ and $\theta^2$ may be defined by some function $\delta(\theta^1, \theta^2)$ which is equal to zero if $\theta^1 = \theta^2$
and is greater than zero if $\theta^1 \ne \theta^2$. Furthermore, for any three points $\theta^1$, $\theta^2$, $\theta^3$
we have $\delta(\theta^1, \theta^2) = \delta(\theta^2, \theta^1)$ and $\delta(\theta^1, \theta^2) + \delta(\theta^2, \theta^3) \ge \delta(\theta^1, \theta^3)$. The distance
function will, in general, be chosen according to practical needs and
mathematical convenience.

Given the distance function $\delta(\theta^1, \theta^2)$ and given the requirements that the
probability of an error of the first kind be $\alpha$ and the probability of an error of
the second kind should not exceed $\beta$ whenever the distance of the true parameter
point from $\theta^0$ is greater than or equal to $d_0$, the aim is, of course, to construct
a sequential test which satisfies these requirements with a minimum expected
number of observations.

While an exact solution of this problem has not yet been found, the following
approach seems reasonable: Let $\Omega_0$ be the set of all parameter points $\theta$ for which
$\delta(\theta^0, \theta) \ge d_0$. We restrict ourselves to the class $C_0$ of sequential tests based on
the ratio $\dfrac{p_{1n}}{p_{0n}}$ where

(6.6)    $p_{0n} = \prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1^0, \ldots, \theta_k^0)$,

(6.7)    $p_{1n} = \int_{\Omega_0} w(\theta) \prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\,d\theta$

and $w(\theta)$ may be any non-negative function of $\theta$, called weight function, for
which

(6.8)    $\int_{\Omega_0} w(\theta)\,d\theta = 1$.

For carrying out the sequential test two constants A and B are chosen. The
hypothesis $H_0$ is accepted if $\dfrac{p_{1n}}{p_{0n}} \le B$, $H_0$ is rejected if $\dfrac{p_{1n}}{p_{0n}} \ge A$, and an additional
observation is made if $B < \dfrac{p_{1n}}{p_{0n}} < A$. The restriction to the class $C_0$ of sequential
tests is suggested by the fact that we are led to these tests if it is required
that some weighted average of the probabilities of errors of the second kind be
equal to a given value $\beta$.

Accepting the restriction that the sequential test should be a member of the
class $C_0$, we still need a principle for choosing the weight function $w(\theta)$. It is
clear that the maximum of $\beta(\theta)$ in $\Omega_0$ depends on the quantities A, B, and the
weight function $w(\theta)$. Denote this maximum value by $\beta_{\max}[A, B, w(\theta)]$. Since
it is desirable to make $\beta_{\max}[A, B, w(\theta)]$ as small as possible, it is proposed to
determine $w(\theta)$ so that the expression $\beta_{\max}[A, B, w(\theta)]$ becomes a minimum with
respect to $w(\theta)$. Since for given values A and B the value of the weighted
average $\int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$ is practically independent of $w(\theta)$ (it is nearly equal
to $\dfrac{B(A - 1)}{A - B}$), minimizing $\beta_{\max}[A, B, w(\theta)]$ is practically equivalent to minimizing
the difference $\beta_{\max}[A, B, w(\theta)] - \int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$. For convenience we
determine $w(\theta)$ so that $\beta_{\max}[A, B, w(\theta)] - \int_{\Omega_0} w(\theta)\beta(\theta)\,d\theta$ becomes a minimum.
For this weight function the maximum of $\beta(\theta)$ in $\Omega_0$ will depend only on A and B.
Denote this value by $\beta(A, B)$. Finally we determine the values A and B so
that $\beta(A, B) = \beta$ and the probability of an error of the first kind becomes $\alpha$.

The determination of $w(\theta)$ is a problem in the calculus of variations. In
some important cases, however, the solution can be obtained by the following
simple procedure: Let $S(d)$ be the set of all parameter points $\theta$ for which
$\delta(\theta^0, \theta) = d$. Let $v(\theta)$ be a non-negative weight function defined over the
surface $S(d_0)$ so that the surface integral $\int_{S(d_0)} v(\theta)\,d\omega = 1$ (where $d\omega$ denotes
the infinitesimal surface element). Consider the following sequential
procedure: Reject $H_0$ if

(6.9)    $\dfrac{\int_{S(d_0)} v(\theta)\left[\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1, \ldots, \theta_k)\right] d\omega}{\prod_{a=1}^{n} f(x_{1a}, \ldots, x_{pa}, \theta_1^0, \ldots, \theta_k^0)}$

is greater than or equal to A, accept $H_0$ if (6.9) is less than or equal to B, and
make an additional observation if the value of (6.9) lies between A and B. The
constants A and B are so chosen that the probability of an error of the first kind
is $\alpha$ and $\int_{S(d_0)} \beta(\theta)v(\theta)\,d\omega = \beta$. In many statistical problems it is possible to
find a weight function $v(\theta)$ such that for a conveniently chosen distance function
$\delta(\theta^1, \theta^2)$ the probability $\beta(\theta)$ of an error of the second kind becomes constant on
the surface $S(d)$ for any value d, and, furthermore, $\beta(\theta)$ decreases with increasing
d. For such a weight function $v(\theta)$, the sequential test based on (6.9) will
provide a solution of the problem. In fact, the weight function $v(\theta)$ over the
surface $S(d_0)$ can be considered a limiting case of a weight function $w(\theta)$ defined
in $\Omega_0$ which takes the value zero for any $\theta$ whose distance from $\theta^0$ is greater than
$d_0 + \Delta$ with $\Delta$ approaching zero in the limit. For the weight function $v(\theta)$ the
maximum of $\beta(\theta)$ in $\Omega_0$ is equal to the weighted integral of $\beta(\theta)$. Thus, for this
weight function the difference between the maximum of $\beta(\theta)$ and the weighted
integral of $\beta(\theta)$ is minimized.

We shall illustrate this procedure by a simple example. Let $X_1, \ldots, X_k$
be k normally and independently distributed variates with unit variances. The
mean values $\theta_1, \ldots, \theta_k$ are unknown. Suppose that it is required to test the
hypothesis $H_0$ that $\theta_1 = \cdots = \theta_k = 0$. Assume that the distance of two points
$\theta^1$ and $\theta^2$ is equal to

$+\sqrt{(\theta_1^1 - \theta_1^2)^2 + \cdots + (\theta_k^1 - \theta_k^2)^2}$.

Then $S(d)$ is a sphere with center at the origin and radius d. Let $v(\theta)$ be
constant on $S(d_0)$ and equal to the reciprocal of the area of $S(d_0)$. We shall show
that for this weight function $v(\theta)$, $\beta(\theta)$ is constant on the sphere $S(d)$ and is
monotonically decreasing with increasing d. For this purpose we prove first
that (6.9) is a monotonically increasing function of $\bar{x}_1^2 + \cdots + \bar{x}_k^2$ where $\bar{x}_i$
is the arithmetic mean of the observations on $x_i$. In fact, the expression (6.9)
becomes

(6.10)    $\dfrac{\dfrac{c_k}{(2\pi)^{kn/2}} \int_{S(d_0)} \exp\left[-\frac{1}{2}\sum_{i=1}^{k}\sum_{a=1}^{n}(x_{ia} - \theta_i)^2\right] d\omega}{\dfrac{1}{(2\pi)^{kn/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{k}\sum_{a=1}^{n} x_{ia}^2\right]} = c_k \exp\left[-\tfrac{1}{2} n d_0^2\right] \int_{S(d_0)} \exp\left[n \sum_{i} \bar{x}_i \theta_i\right] d\omega$

where $c_k$ is the reciprocal of the area of $S(d_0)$ and $\bar{x}_i$ is the arithmetic mean of
the n observations $x_{ia}$ ($a = 1, \ldots, n$). Let $r_{\bar{x}}$ denote $\left|\sqrt{\bar{x}_1^2 + \cdots + \bar{x}_k^2}\right|$ and let
$\alpha(\theta)$ ($0 \le \alpha(\theta) \le \pi$) denote the angle between the vector $(\bar{x}_1, \ldots, \bar{x}_k)$ and the
vector $(\theta_1, \ldots, \theta_k)$. Then (6.10) can be written

(6.11)    $c_k \exp\left[-\tfrac{1}{2} n d_0^2\right] \int_{S(d_0)} \exp\left(n r_{\bar{x}} d_0 \cos[\alpha(\theta)]\right) d\omega$.

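Because of the symmetry of the sphere, the value of (6.11) will not be changed
if we substitute $\gamma(\theta)$ for $\alpha(\theta)$ where $\gamma(\theta)$ ($0 \le \gamma(\theta) \le \pi$) denotes the angle
between the vector $\theta$ and an arbitrarily chosen fixed vector u. From this it
follows that the value of (6.11) depends only on $r_{\bar{x}}$.

Now we shall show that (6.11) is a strictly increasing function of $r_{\bar{x}}$. For this
purpose we have merely to show that

(6.12)    $I(r_{\bar{x}}) = \int_{S(d_0)} \exp\left(n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega$

is a strictly increasing function of $r_{\bar{x}}$. We have

(6.13)    $\dfrac{dI(r_{\bar{x}})}{dr_{\bar{x}}} = \int_{S(d_0)} n d_0 \cos[\gamma(\theta)] \exp\left(n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega$.

Denote by $\omega_1$ the subset of $S(d_0)$ in which $0 \le \gamma(\theta) \le \dfrac{\pi}{2}$ and by $\omega_2$ the subset
in which $\dfrac{\pi}{2} \le \gamma(\theta) \le \pi$. Because of the symmetry of the sphere we have

$\int_{\omega_2} n d_0 \cos[\gamma(\theta)] \exp\left(n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega = \int_{\omega_1} n d_0 \cos[\pi - \gamma(\theta)] \exp\left(n r_{\bar{x}} d_0 \cos[\pi - \gamma(\theta)]\right) d\omega$
$= -\int_{\omega_1} n d_0 \cos[\gamma(\theta)] \exp\left(-n r_{\bar{x}} d_0 \cos[\gamma(\theta)]\right) d\omega$.

Hence

(6.14)    $\dfrac{dI(r_{\bar{x}})}{dr_{\bar{x}}} = n d_0 \int_{\omega_1} \cos[\gamma(\theta)] \left\{\exp\left(n d_0 r_{\bar{x}} \cos[\gamma(\theta)]\right) - \exp\left(-n d_0 r_{\bar{x}} \cos[\gamma(\theta)]\right)\right\} d\omega$.

The right hand side of (6.14) is positive. Hence, we have proved that expression
(6.11) (or (6.10)) is a strictly increasing function of $r_{\bar{x}}$.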

To show that $\beta(\theta)$ is constant on $S(d)$ and is monotonically decreasing with
increasing d, let $y_1, \ldots, y_k$ be an orthogonal linear transformation of $x_1, \ldots, x_k$
so that $E(y_1) = \sqrt{\theta_1^2 + \cdots + \theta_k^2}$, $E(y_i) = 0$ ($i = 2, \ldots, k$). Since
$\bar{y}_1^2 + \cdots + \bar{y}_k^2 = \bar{x}_1^2 + \cdots + \bar{x}_k^2$ and since (6.11) depends only on $\bar{x}_1^2 + \cdots + \bar{x}_k^2$,
it is seen that the sequence of expressions (6.11) formed for any sequence of
integers n has a joint distribution which depends only on $\sqrt{\theta_1^2 + \cdots + \theta_k^2}$.
Hence $\beta(\theta)$ is constant on any sphere with center at the origin. Since (6.11)
is a strictly increasing function of $r_{\bar{x}}$, it can be shown that $\beta(\theta)$ is a monotonically
decreasing function of $\sqrt{\theta_1^2 + \cdots + \theta_k^2}$. Hence, we can test the hypothesis
$H_0$ by the sequential process based on (6.10).

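If $k = 1$, that is, if we test the mean value of a single normal variate, the
sphere $S(d)$ is a 0-dimensional sphere consisting of the two points $\theta_1 = +d$ and
$\theta_1 = -d$, and expression (6.10) reduces to

(6.15)   $\frac{\frac{1}{2}\left\{\exp\left[-\frac{1}{2}\sum_\alpha (x_\alpha - d_0)^2\right] + \exp\left[-\frac{1}{2}\sum_\alpha (x_\alpha + d_0)^2\right]\right\}}{\exp\left[-\frac{1}{2}\sum_\alpha x_\alpha^2\right]}$

$= \frac{1}{2} \exp\left[-\frac{1}{2} n d_0^2\right] \left\{\exp[n \bar{x} d_0] + \exp[-n \bar{x} d_0]\right\}.$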
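
The reduction (6.15) lends itself to a short numerical sketch. The code below is illustrative only: it assumes unit variance, works with the logarithm of (6.15), which equals $-\frac{1}{2} n d_0^2 + \log\cosh(n \bar{x} d_0)$, and uses the customary approximate boundaries $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$ of the sequential probability ratio test. The function names `log_ratio` and `sprt` are hypothetical and do not come from the paper.

```python
import math


def log_ratio(xs, d0):
    """Logarithm of expression (6.15) for observations xs (unit variance assumed)."""
    n = len(xs)
    z = d0 * sum(xs)  # equals n * xbar * d0
    # log cosh(z), computed stably as |z| + log(1 + exp(-2|z|)) - log 2
    log_cosh = abs(z) + math.log1p(math.exp(-2.0 * abs(z))) - math.log(2.0)
    return -0.5 * n * d0 ** 2 + log_cosh


def sprt(xs, d0, alpha, beta):
    """Run the sequential test on xs; return (decision, number of observations used)."""
    log_a = math.log((1.0 - beta) / alpha)  # assumed approximate upper boundary
    log_b = math.log(beta / (1.0 - alpha))  # assumed approximate lower boundary
    for n in range(1, len(xs) + 1):
        lr = log_ratio(xs[:n], d0)
        if lr >= log_a:
            return ("reject H0", n)
        if lr <= log_b:
            return ("accept H0", n)
    return ("continue", len(xs))
```

Because the ratio depends on the data only through $|\bar{x}|$, observations far from zero in either direction push the statistic toward rejection at the same rate.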


6.2. Sequential test of a composite hypothesis. We shall give only a brief
outline of the principles on which a sequential test of a composite hypothesis
can be based, since they are analogous to those for a simple hypothesis. Let
$x_1, \cdots, x_p$ be a set of $p$ random variables and let $f(x_1, \cdots, x_p, \theta_1, \cdots, \theta_k)$
be the joint probability density function of these variables involving $k$ unknown
parameters $\theta_1, \cdots, \theta_k$. Denote the set of all possible parameter points $\theta =
(\theta_1, \cdots, \theta_k)$ by $\Omega$. Suppose that we wish to test the hypothesis $H_\omega$ that the
true parameter point $\theta$ is contained in the subset $\omega$ of $\Omega$. Let $\bar{\omega}$ be the set of
all points of $\Omega$ which are not contained in $\omega$. Furthermore, let $w_0(\theta)$ and $w_1(\theta)$
be two non-negative functions of $\theta$, called weight functions, such that

(6.16)   $\int_\omega w_0(\theta)\, d\theta = 1$ and $\int_{\bar{\omega}} w_1(\theta)\, d\theta = 1.$

If $\omega$ is a surface in the space $\Omega$ then the integral over $\omega$ is meant to be the surface
integral over $\omega$.

In testing a composite hypothesis the probability of an error of the first kind
need not necessarily be the same for all points $\theta$ in $\omega$. It will, in general, be a
function $\alpha(\theta)$ of the true point $\theta$ in $\omega$. Similarly the probability of an error of
the second kind is a function $\beta(\theta)$ of $\theta$ defined for all points in $\bar{\omega}$. Suppose that
we wish to construct a sequential test such that the weighted average
$\int_\omega w_0(\theta)\alpha(\theta)\, d\theta$ of the probabilities of errors of the first kind is a given value
$\alpha$, and the weighted average $\int_{\bar{\omega}} w_1(\theta)\beta(\theta)\, d\theta$ of the probabilities of errors of
the second kind is a given value $\beta$. Then the following sequential test can be
used: Denote by $H_0^*$ the hypothesis that the probability density in the sample
space of $n$ observations on $x_1, \cdots, x_p$ is given by

(6.17)   $p_{0n} = \int_\omega w_0(\theta) \left[\prod_{\alpha=1}^{n} f(x_{1\alpha}, \cdots, x_{p\alpha}, \theta_1, \cdots, \theta_k)\right] d\theta$

and by $H_1^*$ the hypothesis that the density in the sample space is given by

(6.18)   $p_{1n} = \int_{\bar{\omega}} w_1(\theta) \left[\prod_{\alpha=1}^{n} f(x_{1\alpha}, \cdots, x_{p\alpha}, \theta_1, \cdots, \theta_k)\right] d\theta.$

The sequential probability ratio test for testing $H_0^*$ against the single alternative
$H_1^*$ provides a solution of our problem. If the constants $A$ and $B$ in this sequential
test are chosen so that the probability is $\alpha$ that we reject $H_0^*$ when $H_0^*$ is
true, and the probability is $\beta$ that we accept $H_0^*$ when $H_1^*$ is true, then for this
sequential test we have

$\int_\omega w_0(\theta)\alpha(\theta)\, d\theta = \alpha$

and

$\int_{\bar{\omega}} w_1(\theta)\beta(\theta)\, d\theta = \beta.$

This can be proved in the same way as the corresponding statement in the case
of a simple hypothesis.

Frequently we may require a sequential test procedure such that the least
upper bound of $\alpha(\theta)$ in $\omega$ is equal to a given $\alpha$ and $\beta(\theta)$ is less than or equal to a
given $\beta$ for all points $\theta$ whose "distance" (defined in some sense) from $\omega$ is greater
than or equal to a given positive value $d_0$. The "distance" of a parameter
point $\theta$ from $\omega$ may be defined by some function $\delta(\theta, \omega)$ which is positive if $\theta$
is not in $\omega$ and is zero if $\theta$ is in $\omega$. The distance function will be chosen in general
according to practical needs and mathematical convenience. For reasons similar
to those discussed in the case of a simple hypothesis, an appropriate sequential
test procedure with the desired properties can be found as follows: Let $\omega(d)$
be the set of all points $\theta$ for which $\delta(\theta, \omega) \ge d$. Let, furthermore, $w_0(\theta)$ and
$w_1(\theta)$ be two weight functions such that

(6.19)   $\int_\omega w_0(\theta)\, d\theta = \int_{\omega(d_0)} w_1(\theta)\, d\theta = 1.$

Denote by $H_0^*$ the hypothesis that the probability density in the sample space
of $n$ observations on $x_1, \cdots, x_p$ is given by

(6.20)   $p_{0n} = \int_\omega w_0(\theta) \prod_{\alpha=1}^{n} \left[f(x_{1\alpha}, \cdots, x_{p\alpha}, \theta)\right] d\theta$   ($n = 1, 2, \cdots$)

and by $H_1^*$ the hypothesis that the probability density in the sample space of $n$
observations on $x_1, \cdots, x_p$ is given by

(6.21)   $p_{1n} = \int_{\omega(d_0)} w_1(\theta) \prod_{\alpha=1}^{n} \left[f(x_{1\alpha}, \cdots, x_{p\alpha}, \theta)\right] d\theta.$   ($n = 1, 2, \cdots$)

Consider the sequential probability ratio test for testing the simple hypothesis
$H_0^*$ against the single alternative $H_1^*$. For any $\theta$ in $\omega$ let $\alpha(\theta)$ be the probability
of accepting $H_1^*$ when $\theta$ is true, and for any $\theta$ in $\bar{\omega}$ let $\beta(\theta)$ be the probability
of accepting $H_0^*$ when $\theta$ is true. It is clear that $\alpha(\theta)$ and $\beta(\theta)$ depend on
the constants $A$ and $B$ used in the sequential process and on the weight functions
$w_0(\theta)$ and $w_1(\theta)$. For given $A$, $B$, $w_0(\theta)$ and $w_1(\theta)$ let $\beta[A, B, w_0(\theta), w_1(\theta)]$ be the
least upper bound of $\beta(\theta)$ in $\omega(d_0)$ and let $\alpha[A, B, w_0(\theta), w_1(\theta)]$ be the least upper
bound of $\alpha(\theta)$ in $\omega$. Consider the difference

$\Delta\alpha[A, B, w_0(\theta), w_1(\theta)] = \alpha[A, B, w_0(\theta), w_1(\theta)] - \int_\omega w_0(\theta)\alpha(\theta)\, d\theta$

and

$\Delta\beta[A, B, w_0(\theta), w_1(\theta)] = \beta[A, B, w_0(\theta), w_1(\theta)] - \int_{\omega(d_0)} w_1(\theta)\beta(\theta)\, d\theta.$

Determine $w_0(\theta)$ and $w_1(\theta)$ so that $\mathrm{Max}[\Delta\alpha, \Delta\beta]$ is a minimum. For these
weight functions the least upper bound of $\alpha(\theta)$ in $\omega$ and the least upper bound of
$\beta(\theta)$ in $\omega(d_0)$ will be functions of $A$ and $B$ only. Finally, we determine $A$ and $B$
so that the least upper bound of $\alpha(\theta)$ in $\omega$ becomes $\alpha$, and the least upper bound
of $\beta(\theta)$ in $\omega(d_0)$ becomes $\beta$.

The determination of $w_0(\theta)$ and $w_1(\theta)$ involves the solution of problems in
the calculus of variations. However, in some important cases the solution of
the problem can easily be derived, since weight functions $w_0(\theta)$ and $w_1(\theta)$ can be
found for which $\Delta\alpha = \Delta\beta = 0$. Such a situation is given, for instance, in the
following case: Let $S(d)$ be the set of all points $\theta$ for which $\delta(\theta, \omega) = d$. Suppose
that we can find two weight functions $v_0(\theta)$ and $v_1(\theta)$ such that $\int_\omega v_0(\theta)\, d\theta =
\int_{S(d_0)} v_1(\theta)\, dS = 1$ ($dS$ denotes the infinitesimal surface element of $S(d_0)$) and
the sequential probability ratio test based on

$\frac{p_{1n}}{p_{0n}} = \frac{\int_{S(d_0)} v_1(\theta) \left[\prod_{\alpha=1}^{n} f(x_{1\alpha}, \cdots, x_{p\alpha}, \theta)\right] dS}{\int_\omega v_0(\theta) \left[\prod_{\alpha=1}^{n} f(x_{1\alpha}, \cdots, x_{p\alpha}, \theta)\right] d\theta}$

has the following properties: (1) $\alpha(\theta)$ is constant in $\omega$, (2) $\beta(\theta)$ is constant on
$S(d)$ for any $d \ge d_0$, (3) $\beta(\theta)$ is strictly decreasing with increasing $d$ in the
domain $d \ge d_0$. Then for these weight functions we evidently have $\Delta\alpha =
\Delta\beta = 0$.

Let us illustrate this by a simple example. Let $X$ be a normally distributed
variate with unknown mean $\mu$ and unknown variance $\sigma^2$. Suppose that we
want to test the hypothesis $H_0$ that $\mu = 0$ and that the distance of the point
$(\mu, \sigma)$ from the set $\omega$ is defined by $\left|\frac{\mu}{\sigma}\right|$.

The set $S(d_0)$ then consists of all points $(\mu, \sigma)$ for which $\mu = +d_0\sigma$ or $\mu = -d_0\sigma$.
The set $\omega$ consists of all points $(0, \sigma)$ where $\sigma$ can take any arbitrary positive
value. Let $r$ be a positive value. We define the weight functions $v_{0r}(\sigma)$ and
$v_{1r}(\sigma)$ as follows: $v_{0r}(\sigma) = \frac{1}{r}$ if $0 \le \sigma \le r$ and equals zero for all other values of
$\sigma$. The weight function $v_{1r}$ is equal to $\frac{1}{2r}$ if $0 \le \sigma \le r$ and $\mu = \pm d_0\sigma$, and equal
to zero otherwise.



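Then

(6.22)   $p_{0n} = \frac{1}{(2\pi)^{n/2}\, r} \int_0^r \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_\alpha x_\alpha^2}{\sigma^2}\right] d\sigma$

and

(6.23)   $p_{1n} = \frac{1}{2(2\pi)^{n/2}\, r} \int_0^r \frac{1}{\sigma^n} \left\{\exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha - d_0\sigma)^2}{\sigma^2}\right] + \exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma.$

Hence

(6.24)   $\frac{p_{1n}}{p_{0n}} = \frac{\frac{1}{2}\int_0^r \frac{1}{\sigma^n} \left\{\exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha - d_0\sigma)^2}{\sigma^2}\right] + \exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma}{\int_0^r \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_\alpha x_\alpha^2}{\sigma^2}\right] d\sigma}.$

We consider the limiting case when $r \to \infty$. Then

(6.25)   $\lim_{r \to \infty} \frac{p_{1n}}{p_{0n}} = \frac{\frac{1}{2}\int_0^\infty \frac{1}{\sigma^n} \left\{\exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha - d_0\sigma)^2}{\sigma^2}\right] + \exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma}{\int_0^\infty \frac{1}{\sigma^n} \exp\left[-\frac{1}{2}\frac{\sum_\alpha x_\alpha^2}{\sigma^2}\right] d\sigma}.$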

The sequential test based on the ratio (6.25) provides a solution of the problem
if it can be shown to have the following three properties: (1) $\alpha(\theta)$ is constant in
$\omega$; (2) $\beta(\theta)$ is only a function of $\left|\frac{\mu}{\sigma}\right|$; (3) $\beta(\theta)$ is monotonically decreasing with
increasing $\left|\frac{\mu}{\sigma}\right|$. Denote $\frac{1}{n}\sum_{\alpha=1}^{n} x_\alpha$ by $\bar{x}$ and $\sum_{\alpha=1}^{n} (x_\alpha - \bar{x})^2$ by $S^2$. Since the
distribution of $\frac{\bar{x}}{S}$ depends only on $\frac{\mu}{\sigma}$, the first two properties are proved if we
show that the ratio (6.25) is a single valued function of $\left|\frac{\bar{x}}{S}\right|$.

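First we show that the numerator of the ratio (6.25) is a homogeneous function
of $(x_1, \cdots, x_n)$ of degree $-(n - 1)$. In fact, making the transformation
$\sigma = \lambda t$ ($\lambda > 0$) we obtain

$\int_0^\infty \frac{1}{\sigma^n} \left\{\exp\left[-\frac{1}{2}\frac{\sum_\alpha (\lambda x_\alpha - d_0\sigma)^2}{\sigma^2}\right] + \exp\left[-\frac{1}{2}\frac{\sum_\alpha (\lambda x_\alpha + d_0\sigma)^2}{\sigma^2}\right]\right\} d\sigma$

$= \lambda^{-(n-1)} \int_0^\infty \frac{1}{t^n} \left\{\exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha - d_0 t)^2}{t^2}\right] + \exp\left[-\frac{1}{2}\frac{\sum_\alpha (x_\alpha + d_0 t)^2}{t^2}\right]\right\} dt.$

This proves that the numerator of (6.25) is a homogeneous function of degree
$-(n - 1)$. Similarly, it can be shown that the denominator of (6.25) is also a
homogeneous function of degree $-(n - 1)$. Thus the ratio (6.25) is a homogeneous
function of zero degree in the variables $x_1, \cdots, x_n$.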

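It can be seen that (6.25) is a function of the two expressions $\sum x_\alpha^2$ and $\sum x_\alpha$
only; i.e.,

(6.26)   $\frac{p_{1n}}{p_{0n}} = \phi\left(\sum x_\alpha^2, \sum x_\alpha\right).$

Let $v = \left|\sqrt{\sum x_\alpha^2}\right|$. Since (6.26) is a homogeneous function of zero degree, its
value is not changed by substituting $\frac{x_\alpha}{v}$ for $x_\alpha$. Hence,

(6.27)   $\frac{p_{1n}}{p_{0n}} = \phi\left(1, \frac{\sum x_\alpha}{v}\right).$

Since $\phi\left(\sum x_\alpha^2, -\sum x_\alpha\right) = \phi\left(\sum x_\alpha^2, \sum x_\alpha\right)$, we see that

$\frac{p_{1n}}{p_{0n}} = \phi\left(1, \frac{\left|\sum x_\alpha\right|}{v}\right).$

Since $\frac{|\bar{x}|}{v}$ is a single valued function of $\left|\frac{\bar{x}}{S}\right|$, we have proved that $\frac{p_{1n}}{p_{0n}}$ is a single
valued function of $\left|\frac{\bar{x}}{S}\right|$.

In order to prove property (3) of the sequential test based on the ratio (6.25),
we have merely to show that (6.25) is a strictly increasing function of $\left|\frac{\bar{x}}{S}\right|$.
Since $\frac{\bar{x}^2}{v^2}$ is a strictly increasing function of $\left|\frac{\bar{x}}{S}\right|$, we have only to show that
(6.25) is a strictly increasing function of $\frac{\bar{x}^2}{v^2}$. The latter statement is obviously
proved if we show that (6.25) increases with increasing value of $|\bar{x}|$ while keeping
$v$ fixed. For a fixed value of $v$ the denominator of (6.25) is constant. Thus, we
have merely to show that the numerator of (6.25) increases with increasing
$|\bar{x}|$ while keeping $v$ fixed. This follows easily from the fact that

$\exp\left[\frac{\left(\sum x_\alpha\right) d_0}{\sigma}\right] + \exp\left[-\frac{\left(\sum x_\alpha\right) d_0}{\sigma}\right]$

is a strictly increasing function of $|\bar{x}|$.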


REFERENCES

[1] H. F. Dodge and H. G. Romig, "A method of sampling inspection," The Bell System
Tech. Jour., Vol. 8 (1929), pp. 613-631.

[2] Walter Bartky, "Multiple sampling with constant probability," Annals of Math.
Stat., Vol. 14 (1943), pp. 363-377.

[3] Harold Hotelling, "Experimental determination of the maximum of a function,"
Annals of Math. Stat., Vol. 12 (1941).

[4] Abraham Wald, "On cumulative sums of random variables," Annals of Math. Stat.,
Vol. 15 (Sept. 1944).

[5] Z. W. Birnbaum, "An inequality for Mill's ratio," Annals of Math. Stat., Vol. 13 (1942).

[6] P. C. Mahalanobis, "A sample survey of the acreage under jute in Bengal, with
discussion on planning of experiments," Proc. 2nd Ind. Stat. Conf., Calcutta,
Statistical Publishing Soc. (1940).

[7] Abraham Wald, Sequential Analysis of Statistical Data: Theory. A report submitted
by the Statistical Research Group, Columbia University to the Applied Mathematics
Panel, National Defense Research Committee, Sept. 1943.

[8] Harold Freeman, Sequential Analysis of Statistical Data: Applications. A report
submitted by the Statistical Research Group, Columbia University to the Applied
Mathematics Panel, National Defense Research Committee, July 1944.

[9] G. A. Barnard, M.A., Economy in Sampling with Reference to Engineering Experimentation,
(British) Ministry of Supply, Advisory Service on Statistical Method and
Quality Control, Technical Report, Series 'R', No. Q.C./R/7, Part 1.

[10] C. M. Stockman, A Method of Obtaining an Approximation for the Operating Characteristic
of a Wald Sequential Probability Ratio Test Applied to a Binomial Distribution,
(British) Ministry of Supply, Advisory Service on Statistical Method and
Quality Control, Technical Report, Series 'R', No. Q.C./R/19.

[11] Abraham Wald, A General Method of Deriving the Operating Characteristics of any
Sequential Probability Ratio Test. A Memorandum submitted to the Statistical
Research Group, Columbia University, April 1944.



NON-PARAMETRIC ESTIMATION. I. VALIDATION 
OF ORDER STATISTICS 

By H. Scheffé and J. W. Tukey
Syracuse University and Princeton University 

1. Summary. Previous work on non-parametric estimation has concerned
three problems: (i) confidence intervals for an unknown quantile, (ii) population
tolerance limits, (iii) confidence bands for an unknown cumulative distribution
function (cdf). For problem (iii) a solution has been available which is valid
for any cdf whatever, but for (i) and (ii) it has heretofore been assumed that the
population has a continuous probability density. This paper validates the
existing solutions of (i) and (ii) assuming only a continuous cdf. It then modifies
these solutions so that they are valid for any cdf whatever.

2. Introduction. There are three problems of non-parametric estimation
(we exclude point-estimation) for which fairly satisfactory solutions are available;
their present status was summarized in a recent paper [4]. The purpose of this
series of articles is to extend and complete the theory of non-parametric estimation
in directions of both theoretical and practical interest.

In this series we shall employ the following conventions of notation: We distinguish
between a random variable and an arbitrary point in the Euclidean
space containing its domain by using a capital Roman letter for the former and
the corresponding lower case Roman letter for the latter. Thus if $X$ is a (scalar)
random variable, and $x$ a real number or $\pm\infty$, we speak of the probability that
$X \le x$ and denote it by $\Pr\{X \le x\}$. Roman capitals will also be used to denote
cumulative distribution functions¹ (cdf's): A monotone non-decreasing function
$F(x)$ will be called the cdf of $X$ if $F(x + 0) = \Pr\{X \le x\}$. The definition of
$F(x)$ at its points of discontinuity will be immaterial. Again, $E = (X_1, \cdots,
X_n)$ will denote a random sample from a population with cdf $F(x)$, whereas $e =
(x_1, \cdots, x_n)$ will denote a point in the sample space $R_n$. If $t$ is a function of $e$
only, $t = \varphi(e)$, then the random variable $T = \varphi(E)$ is a statistic. The order
statistics of the sample $E$ are defined to be $-\infty, Z_1, \cdots, Z_n, +\infty$, where $z_1 \le
z_2 \le \cdots \le z_n$ is a rearrangement of $x_1, x_2, \cdots, x_n$. We shall write $Z_0 =
-\infty$, $Z_{n+1} = +\infty$. The device of including $+\infty$ and $-\infty$ among the order
statistics will enable us to avoid special statements to cover the case of one-sided
estimation. Confidence coefficients will be denoted by $1 - \alpha$. Finally, it will
be convenient to symbolize² the following three classes of cdf's: $\Omega_0$ is the class of
all univariate cdf's $F$; $\Omega_2$, the class of all continuous $F$; $\Omega_4$, the class of all $F$ with
continuous derivative $F'(x)$.

¹ One of the authors wishes to point out the need of a clear, concise, and adequate term
for this basic and important concept.
² The notation follows [3].




We now list the three problems. In each case it is understood that the solution
sought is to be valid for all cdf's in some chosen class. The names³ associated
with the problems are (i) W. R. Thompson, K. R. Nair, (ii) Wilks, (iii)
Wald, Wolfowitz, Kolmogoroff.

(i) To find confidence intervals for an unknown quantile $q_p$, where $q_p$ is
defined by $F(q_p) = p$, $0 < p < 1$; in other words, to find statistics $T_1$, $T_2$ such
that⁴

(1.1)   $\Pr\{T_1 < q_p < T_2 \mid F\} = 1 - \alpha.$

(ii) To find tolerance limits $T_1$, $T_2$ which, with confidence $1 - \alpha$, will cover a
proportion $b$ or more of the population, that is,

(1.2)   $\Pr\{F(T_2) - F(T_1) \ge b \mid F\} = 1 - \alpha.$

(iii) To find a confidence band for an unknown cdf $F$, that is, a random region
$R(E)$ in the $x,y$-plane such that

(1.3)   $\Pr\{R(E) \text{ covers } g(F) \mid F\} = 1 - \alpha,$

where $g$ is the graph of $y = F(x)$.

The existing solutions of problem (iii) are known to be valid for $F$ in $\Omega_2$,
but those of problems (i) and (ii) have been validated only for $F$ in $\Omega_4$. The
extension to $F$ in $\Omega_2$ is an immediate consequence of the theorem in section 4;
this section also contains a discussion of some other implications of the theorem.
In section 5 the appropriate modifications of the solutions of problems (i) and
(ii) are found which extend their validity to the general case $F$ in $\Omega_0$. Whereas
Pitman ([1]; also [4], p. 310) has shown how non-parametric tests may be extended
to the possibly discontinuous case, the only solution of the three estimation
problems previously extended to this case is that of Kolmogoroff for problem
(iii). Extension from $\Omega_2$ to $\Omega_0$ is of considerable practical interest, not only
in the case of populations ordinarily considered discrete, but also as affecting
the problem of the finiteness of the number of significant figures in measurements
and the resulting occurrence of "ties" in ranked measurements. Before making
these extensions we discuss in the next section the transformations on which
they are based.


3. Two useful transformations of random variables. We shall reserve the
symbol $X^*$ for a random variable having a uniform distribution on the interval
from 0 to 1. Its cdf is

(1.4)   $U(x^*) = \Pr\{X^* \le x^*\} = \begin{cases} 0 & \text{if } x^* \le 0, \\ x^* & \text{if } 0 \le x^* \le 1, \\ 1 & \text{if } x^* \ge 1. \end{cases}$

³ For bibliography see [4].

⁴ The notation $\Pr\{R \mid F_0\}$ denotes the probability of the relation $R$ being true, calculated
under the assumption that the cdf of the population is $F_0(x)$.





The device of transforming from any random variable $X$ with cdf $F$ in $\Omega_4$
to one with cdf $U$ was early used by Karl Pearson and more recently by many
others; it is known in the literature as the "probability integral transformation."
We define the transformation $x^* = h_F(x)$ as follows: For $-\infty < x < +\infty$,
$h_F(x) = F(x)$, $h_F(+\infty) = +\infty$, $h_F(-\infty) = -\infty$. If $F$ is in $\Omega_2$, the following
statements are evident for the transform $X^* = h_F(X)$: $X^*$ has $U(x^*)$ as its cdf.
With $X_i^* = h_F(X_i)$, a random sample $E = (X_1, \cdots, X_n)$ from $F$ transforms
into a random sample $E^* = (X_1^*, \cdots, X_n^*)$ from $U$. The order statistics
$\{Z_i\}$ of $E$ transform into the order statistics $\{Z_i^*\}$ of $E^*$ with $Z_i^* = h_F(Z_i)$,
$i = 0, 1, \cdots, n + 1$.

It is easily seen that if $F$ is not in $\Omega_2$, the above transformation $X^* = h_F(X)$
does not give $X^*$ the cdf $U$; indeed, if $F$ is not in $\Omega_2$, the cdf of any single-valued
function $Y$ of $X$ is also not in $\Omega_2$, for there will be at least one point $x = x_0$ with
positive probability, and likewise for its transform $y_0$. Nevertheless our arguments
in section 4 depend on relating a random variable with arbitrary cdf $F$ in
$\Omega_0$ to the uniformly distributed $X^*$. While it is not possible to transform from
$X$ to $X^*$ without introducing a further random process, it is possible to transform
directly from $X^*$ to $X$. This suffices for our needs. We shall always denote
this transformation by $X = g_F(X^*)$. The following definition of the function
$x = g_F(x^*)$ makes it independent of the normalization of $F$ at its discontinuities:

(1.5)   $F(x - 0) \le U(x^*) \le F(x + 0).$

A sketched diagram may aid the reader in following the argument: To every
$x^*$ ($-\infty \le x^* \le +\infty$) there corresponds at least one $x$, and this $x$ is unique
unless it lies in an interval to which $F$ assigns zero probability. In the latter
case we shall assume that some $x$ in the interval is designated to be $g_F(x^*)$. It
will be seen that it is immaterial which $x$ is thus chosen. However, if $x = -\infty$
or $+\infty$ is in an interval of constancy of $F$ we specify $g_F(-\infty) = -\infty$, $g_F(+\infty) =
+\infty$.

To prove that $g_F(X^*)$ has the cdf $F(x)$ and thus can be identified with $X$, it
is sufficient to prove that $\Pr\{g_F(X^*) \le x\} = F(x + 0)$. Now $g_F(X^*) \le x$ if and
only if $X^* \le x_+^*$, where

$x_+^* = \sup_{x = g_F(x^*)} x^*.$

Hence $\Pr\{g_F(X^*) \le x\} = \Pr\{X^* \le x_+^*\} = U(x_+^*) = F(x + 0)$. It follows
that a random sample $E^*$ from $U$ transforms into a random sample $E$ from $F$.
The transformation preserves the relation $\le$; that is, if $x_a = g_F(x_a^*)$, $x_b =
g_F(x_b^*)$, then $x_a^* \le x_b^*$ implies $x_a \le x_b$. This means that the order statistics
$\{Z_i^*\}$ of $E^*$ transform into the order statistics $\{Z_i\}$ of $E$. We remark that
$x_a^* < x_b^*$ does not imply $x_a < x_b$; there is trouble when $x^* < 0$ or $> 1$, and
more serious trouble if $x_a^*$ and $x_b^*$ both go into the same discontinuity of $F$.
However, we shall need to utilize the fact that $x_a < x_b$ implies $x_a^* < x_b^*$.
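
The behaviour of $g_F$ at a discontinuity can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: it treats a purely discrete $F$ (a fair die), restricts $x^*$ to $0 < x^* \le 1$, and resolves the arbitrary choice inside an interval of constancy by taking the smallest admissible $x$. The names `make_g` and `die` are hypothetical.

```python
from fractions import Fraction


def make_g(atoms, probs):
    """Return a version of g_F for a discrete cdf with the given atoms (sorted) and probabilities."""
    cumulative = []
    total = Fraction(0)
    for p in probs:
        total += p
        cumulative.append(total)  # F(x + 0) at each atom

    def g(u):
        if not 0 < u <= 1:
            raise ValueError("this sketch only handles 0 < u <= 1")
        # Smallest atom x with F(x + 0) >= u; it then satisfies F(x - 0) <= u <= F(x + 0).
        for x, c in zip(atoms, cumulative):
            if u <= c:
                return x
        return atoms[-1]

    return g


die = make_g([1, 2, 3, 4, 5, 6], [Fraction(1, 6)] * 6)
```

Here `die(Fraction(5, 12))` and `die(Fraction(1, 2))` both return 3: two distinct uniform values fall into the same discontinuity, so strict order is lost while the relation $\le$ is preserved.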





4. Extension to continuous cdf's. A sufficient condition on $T_1$ and $T_2$ for a
solution (1.2) of problem (ii) to be valid for all $F$ in $\Omega_2$ is clearly that the joint
distribution of $F(T_1)$ and $F(T_2)$ be independent of $F$ in $\Omega_2$. If $\Pr\{F(T_i) =
p \mid F\} = 0$ ($i = 1, 2$), then (1.1) is equivalent to

(1.6)   $\Pr\{F(T_1) < p < F(T_2) \mid F\} = 1 - \alpha,$

and so a sufficient condition that a solution (1.1) of problem (i) be valid for all
$F$ in $\Omega_2$ is again that the joint distribution of $F(T_1)$ and $F(T_2)$ be independent of
$F$ in $\Omega_2$. We are thus led to consider sufficient conditions on a set $T_1, T_2, \cdots,
T_r$ of statistics, which will insure that the joint distribution of $F(T_1), F(T_2),
\cdots, F(T_r)$ be independent of $F$ in $\Omega_2$.

Theorem: A sufficient condition for the joint distribution of $F(T_1), F(T_2), \cdots,
F(T_r)$ to be independent of $F$ in $\Omega_2$ is that the $\{T_i\}$ be a subset of the order statistics
$\{Z_i\}$ of the sample.

To prove the theorem it will suffice to show that the joint distribution of the
set of $n$ random variables $F(Z_1), F(Z_2), \cdots, F(Z_n)$ is independent of $F$ in $\Omega_2$.
Let the cdf of the joint distribution be

(1.7)   $\Phi(\lambda_1, \lambda_2, \cdots, \lambda_n) = \Pr\{F(Z_1) \le \lambda_1, \cdots, F(Z_n) \le \lambda_n \mid F\}.$

Employing the transformation $x^* = h_F(x)$ discussed in section 3, we see that the
above probability equals

(1.8)   $\Pr\{Z_1^* \le \lambda_1, \cdots, Z_n^* \le \lambda_n\},$

where $Z_1^*, Z_2^*, \cdots, Z_n^*$ are the order statistics of a random sample $E^*$ from the
uniform cdf $U$. But this probability does not depend on $F$.
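
A small simulation can make the theorem tangible. The sketch below is illustrative and rests on assumptions not in the text: it uses two particular continuous populations (exponential and normal), a fixed seed, and the known fact that the $k$th order statistic of a uniform sample of size $n$ has mean $k/(n+1)$. The function name `mean_cdf_at_order_statistic` is hypothetical.

```python
import math
import random


def mean_cdf_at_order_statistic(draw, cdf, n, k, reps=20000, seed=0):
    """Monte Carlo estimate of E[F(Z_k)] for samples of size n drawn with draw(rng)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        sample = sorted(draw(rng) for _ in range(n))
        total += cdf(sample[k - 1])  # F evaluated at the k-th order statistic
    return total / reps


# Exponential population with unit rate.
exp_mean = mean_cdf_at_order_statistic(
    lambda rng: rng.expovariate(1.0),
    lambda x: 1.0 - math.exp(-x),
    n=5, k=2,
)

# Standard normal population.
norm_mean = mean_cdf_at_order_statistic(
    lambda rng: rng.gauss(0.0, 1.0),
    lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))),
    n=5, k=2,
)
```

Both estimates should lie close to $2/6$, whatever the continuous population, which is what independence of $F$ means in practice.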

Since the existing solutions of problems (i) and (ii) are obtained by taking
$T_1$ and $T_2$ to be order statistics, we have validated these solutions for all $F$ in
$\Omega_2$. That the existing solutions of problem (iii) are valid for $F$ in $\Omega_2$ has been
demonstrated by their authors; this is however also an easy consequence of the
above theorem. The sufficiency condition expressed by this theorem together
with a necessity condition of Robbins [2] may indicate a natural path to the
formulation and solution of further problems of non-parametric estimation.

From a theoretical point of view it is of interest to note that even in those
pathological cases where no probability density function exists for the cdf $F$
in $\Omega_2$ ($F$ is non-absolutely continuous), the joint distribution (1.7) of $F(Z_1)$,
$F(Z_2), \cdots, F(Z_n)$ always possesses a density. That this density is $n!$ for $0 \le
F(Z_1) \le F(Z_2) \le \cdots \le F(Z_n) \le 1$, and zero elsewhere, is evident if we consider
(1.8). By "integrating out" the other variables we are led to the following
practically useful result (it is well known for $F$ in $\Omega_4$): Choose any set $\{r_j\}$
of $s$ integers ($1 \le r_1 < r_2 < \cdots < r_s \le n$), and consider the joint distribution
of $F(Z_{r_1}), F(Z_{r_2}), \cdots, F(Z_{r_s})$. This has a probability density function $f(t_1,
t_2, \cdots, t_s)$, providing $F$ is in $\Omega_2$, given by the formula

(1.9)   $f(t_1, t_2, \cdots, t_s) = n!\, \frac{t_1^{r_1 - 1}}{(r_1 - 1)!}\, \frac{(1 - t_s)^{n - r_s}}{(n - r_s)!} \prod_{i=1}^{s-1} \frac{(t_{i+1} - t_i)^{r_{i+1} - r_i - 1}}{(r_{i+1} - r_i - 1)!}$

for $0 \le t_1 \le t_2 \le \cdots \le t_s \le 1$, and $f = 0$ elsewhere. As is conventional, the
result of applying $\prod_{i=1}^{0}$ is to be interpreted as unity, and the meaning of $f$ is
given by

$\Pr\{F(Z_{r_i}) \le a_i\ (i = 1, 2, \cdots, s) \mid F\} = \int_{-\infty}^{a_1} \int_{-\infty}^{a_2} \cdots \int_{-\infty}^{a_s} f(t_1, t_2, \cdots, t_s)\, dt_s \cdots dt_2\, dt_1.$


5. Extension to discontinuous cdf's. Suppose we have a solution of problem
(i) based on order statistics and hence valid for $F$ in $\Omega_2$, say,

(1.10)   $\Pr\{Z_k < q_p < Z_l \mid F\} = 1 - \alpha,$

where $0 \le k < l \le n + 1$. In particular this is valid for the uniform case,

(1.11)   $\Pr\{Z_k^* < p < Z_l^*\} = 1 - \alpha.$

We now transform from the uniform cdf $U$ to an arbitrary $F$ in $\Omega_0$ by means of
the transformation $x = g_F(x^*)$ described in section 3. Suppose $q_p$ is defined
by $q_p = g_F(p)$. This means the quantile $q_p$ of the distribution with cdf $F$ is
determined from the relation

$F(q_p - 0) \le p \le F(q_p + 0),$

which assigns to the quantile its usual meaning if $F(x)$ is continuous and non-constant
at $x = q_p$, and a sensible definition if $F$ is discontinuous or constant
at $q_p$. From the discussion in section 3 we have

$(Z_k < q_p < Z_l)$ implies $(Z_k^* < p < Z_l^*)$ implies $(Z_k \le q_p \le Z_l)$,

and hence the probability relations

$\Pr\{Z_k < q_p < Z_l \mid F\} \le \Pr\{Z_k^* < p < Z_l^*\} \le \Pr\{Z_k \le q_p \le Z_l \mid F\}.$

Substituting (1.11), we have

(1.12)   $\Pr\{Z_k < q_p < Z_l \mid F\} \le 1 - \alpha \le \Pr\{Z_k \le q_p \le Z_l \mid F\}.$

The statistical interpretation of (1.12) is the following: Consider any solution
(1.10) of problem (i), giving a confidence interval for the quantile $q_p$, valid for $F$
in $\Omega_2$. Then with the same values of $n$, $k$, $l$, and $\alpha$, the probability of the random
interval from $Z_k$ to $Z_l$ covering the unknown quantile $q_p$ is $\le 1 - \alpha$ for the open
interval, $\ge 1 - \alpha$ for the closed interval, no matter what the unknown cdf $F$.
If $F$ is continuous, the two probabilities are of course equal.
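
For a concrete value of $1 - \alpha$ in the uniform case (1.11), one can use the fact that $Z_k^* < p < Z_l^*$ holds exactly when the number of sample values below $p$ lies between $k$ and $l - 1$, a count that is binomially distributed. The sketch below is an illustration resting on that observation; the function name `confidence_coefficient` is hypothetical and not part of the paper.

```python
from fractions import Fraction
from math import comb


def confidence_coefficient(n, k, l, p):
    """Pr{Z_k* < p < Z_l*} for a uniform sample of size n, with 0 <= k < l <= n + 1."""
    p = Fraction(p)
    # The event holds when at least k and at most l - 1 observations fall below p.
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, l))
```

For the median with $n = 10$, $k = 2$, $l = 9$, this gives $501/512$, about 0.9785. By (1.12), for an arbitrary cdf the open interval from $Z_2$ to $Z_9$ then covers the median with probability at most this value and the closed interval with probability at least this value.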





To extend the solution of problem (ii) to the general case $F$ in $\Omega_0$, suppose we
have a solution (1.2) using order statistics, say $T_1 = Z_k$, $T_2 = Z_l$ ($0 \le k < l \le
n + 1$). Such a solution will be valid for all $F$ in $\Omega_2$, in particular for $F = U$,

$\Pr\{U(Z_l^*) - U(Z_k^*) \ge b\} = 1 - \alpha.$

Given now any arbitrary distribution $F$, we again use the transformation $x =
g_F(x^*)$. From (1.5),

$F(Z_i - 0) \le U(Z_i^*) \le F(Z_i + 0)$   ($i = k, l$).

Hence

$B_- \le B^* \le B_+,$

where

$B_- = F(Z_l - 0) - F(Z_k + 0),$

$B^* = U(Z_l^*) - U(Z_k^*),$

$B_+ = F(Z_l + 0) - F(Z_k - 0).$

The implications

$(B_- \ge b)$ implies $(B^* \ge b)$ implies $(B_+ \ge b)$

yield the relations

$\Pr\{B_- \ge b\} \le \Pr\{B^* \ge b\} \le \Pr\{B_+ \ge b\}.$

These may be written

(1.13)   $\Pr\{F(Z_l - 0) - F(Z_k + 0) \ge b \mid F\} \le 1 - \alpha \le \Pr\{F(Z_l + 0) - F(Z_k - 0) \ge b \mid F\}.$

To interpret (1.13), let us say that a Borel set $S$ covers a proportion $\pi$ of a
population with cdf $F(x)$ if $\int_S dF(x) = \pi$. If $S$ is an interval from $x'$ to $x''$,
then the proportion covered by $S$ is $F(x'' + 0) - F(x' - 0)$ if $S$ is closed, and
$F(x'' - 0) - F(x' + 0)$ if $S$ is open. The proportion covered by a point $x_0$
is the jump $F(x_0 + 0) - F(x_0 - 0)$ of the cdf $F$ at $x_0$. The statistical meaning
of (1.13) is now clear: For the random interval from $Z_k$ to $Z_l$, the probability
that the open interval cover a proportion $\ge b$ of the population is $\le 1 - \alpha$; the
probability that the closed interval cover a proportion $\ge b$ of the population is
$\ge 1 - \alpha$, regardless of the population. Again, for a continuous $F$ the two
probabilities are equal.
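
The uniform-case confidence $\Pr\{U(Z_l^*) - U(Z_k^*) \ge b\}$ can be evaluated in closed form. The sketch below assumes the standard result that the coverage between the $k$th and $l$th order statistics of a uniform sample follows a Beta distribution with parameters $l - k$ and $n - l + k + 1$, whose upper tail equals a binomial sum. The function name `tolerance_confidence` is hypothetical.

```python
from fractions import Fraction
from math import comb


def tolerance_confidence(n, k, l, b):
    """Pr{U(Z_l*) - U(Z_k*) >= b} for a uniform sample of size n, 0 <= k < l <= n + 1."""
    m = l - k
    b = Fraction(b)
    # Upper tail of Beta(m, n - m + 1) at b, written as Pr{Binomial(n, b) <= m - 1}.
    return sum(comb(n, i) * b ** i * (1 - b) ** (n - i) for i in range(m))
```

For example, with $n = 59$, $k = 0$ and $l = 59$ (the one-sided limit formed by the largest observation) and $b = 0.95$, the value is $1 - 0.95^{59}$, a little above 0.95. By (1.13) this value bounds the open-interval probability from above and the closed-interval probability from below for any population.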

REFERENCES

[1] E. J. G. Pitman, Suppl. J. Roy. Stat. Soc., Vol. 4 (1937), pp. 117-130.

[2] H. Robbins, Annals of Math. Stat., Vol. 15 (1944), pp. 214-216.

[3] H. Scheffé, Annals of Math. Stat., Vol. 14 (1943), pp. 227-233.

[4] H. Scheffé, Annals of Math. Stat., Vol. 14 (1943), pp. 305-332.



ON A TEST FOR RANDOMNESS BASED ON SIGNS OF DIFFERENCES* 

By Henry B. Mann 
Ohio State University

1. Introduction. It has been pointed out by J. Wolfowitz [1] that we cannot
expect a test for randomness to be most powerful with respect to every possible
alternative. It is therefore necessary to find tests designed to distinguish a
random sample of observations from the same population from a sample coming
from some particular class $\Omega$ of distributions. Such a test need be consistent
in the sense of Wald and Wolfowitz [2] only with respect to alternatives in the
class $\Omega$.

Let $x_1, \cdots, x_n$ be the measurable quality characteristics of $n$ units of a
manufactured article. We shall assume that the distribution of $x_i$ is continuous.
According to Shewhart the production process is termed "under statistical
control" if $x_1, \cdots, x_n$ can be regarded as a random sample of $n$ independent
items each coming from the same population with known or unknown distribution
function.

In a random sample $p_i = P(x_i > x_{i+1}) = \frac{1}{2}$, where $P(E)$ denotes the probability
that $E$ will hold. The class $\Omega$ of alternatives which we shall consider is
described as follows. The cumulative distribution of $x_i$ is $f_i$, and the $f_i$, $i =
1, 2, \cdots$, are such that

$p_i = \frac{1}{2} + \epsilon_i, \quad \sum_{i=1}^{n-1} \epsilon_i = \lambda_n (n - 1), \quad \liminf \lambda_n = \lambda > 0.$

Such a situation may, for instance, obtain if the production process is under
statistical control except for occasionally but not too infrequently occurring
periods during which the quality of the product decreases, after which decrease
statistical control is immediately restored. If the decreases in quality are sharp
enough or the periods of decrease long enough, then the alternative will belong
to the class described before.

To give a practical example: consider a drill, which after some period of use will
wear off so that the quality of the manufactured article will decrease until the
drill is exchanged. After replacement of the drill by a new one, statistical control
is immediately restored. Now, if the drill is not replaced in time, the
periods of decrease in quality will be long and the rate of decrease will become
rapid so that the sequence of distribution functions will satisfy the conditions
of the class $\Omega$. A similar situation occurs also in time studies. For instance,
in the foregoing example, the time necessary for drilling one hole will tend to
increase when the drill is too long in use.

The following test first proposed by Moore and Wallis [3] for the study of
economic time series seems appropriate for our purpose: Let $x_1, \cdots, x_n$ be the
sample and form the sequence $x_2 - x_1, \cdots, x_n - x_{n-1}$. Let $S$ be the number
of negative differences in this sequence. Clearly, the distribution of $S$ is independent
of the distribution of $x$, provided the sample is an independent random
sample from a continuous distribution. Under one of the alternatives of the
class $\Omega$, $S$ will in a sample of $n$ tend to be larger than in a random sample if $\lambda_n > 0$.
Hence $S$ may be used as a statistic to distinguish between randomness and any
of the alternatives of the class $\Omega$. The distribution of $S$ was tabulated by
Moore and Wallis [3] for $n < 12$. They also found empirically that $S$ approaches
a normal distribution. The asymptotic normality of the distribution of $S$
can be proved rigorously in a way analogous to the proof of Theorem 1 of a
paper by Wolfowitz [4]. The first four moments of $S$ were obtained by Moore
and Wallis, the fourth moment, however, only by empirical methods. In
this paper we shall derive a formula which makes it possible to compute the
moments of $S$ recursively. With the help of this formula we shall indicate an
alternative proof of the asymptotic normality of $S$ using the method of moments.
Finally, we shall derive a lower bound for the power of the $S$ test with respect
to alternatives in $\Omega$ valid for large $n$ and depending only on $\lambda_n$.

* Research under a grant of the Research Foundation of the Ohio State University.
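
As a minimal sketch of the statistic itself, the code below counts the negative differences and standardizes $S$ using the mean $(n-1)/2$ and variance $(n+1)/12$ obtained in section 2 of this paper. The function names `negative_differences` and `standardized_s` are hypothetical, and the normal approximation is only appropriate for reasonably large $n$.

```python
import math


def negative_differences(xs):
    """Number S of negative differences x[i+1] - x[i] in the sequence xs."""
    return sum(1 for a, b in zip(xs, xs[1:]) if b - a < 0)


def standardized_s(xs):
    """(S - (n - 1)/2) / sqrt((n + 1)/12), approximately standard normal under randomness."""
    n = len(xs)
    s = negative_differences(xs)
    return (s - (n - 1) / 2.0) / math.sqrt((n + 1) / 12.0)
```

A large positive value of `standardized_s` points toward the alternatives of the class described above, in which downward steps are more frequent than upward ones.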


2. The moments of S. Let $P_n(S)$ be the number of permutations in $n$ variables
with $S$ negative differences. MacMahon [5] has shown that

(1)   $P_n(S) = (S + 1) P_{n-1}(S) + (n - S) P_{n-1}(S - 1).$
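
Recurrence (1) is easy to run directly. The sketch below builds $P_n(S)$ from the starting value $P_1(0) = 1$ and, as a check, recounts the permutations by brute force; the function names `descent_counts` and `brute_force_counts` are hypothetical.

```python
from itertools import permutations


def descent_counts(n):
    """List [P_n(0), ..., P_n(n - 1)] computed from recurrence (1), starting at P_1(0) = 1."""
    counts = [1]
    for m in range(2, n + 1):
        previous = counts
        counts = []
        for s in range(m):
            same = previous[s] if s < len(previous) else 0  # P_{m-1}(s)
            lower = previous[s - 1] if s >= 1 else 0        # P_{m-1}(s - 1)
            counts.append((s + 1) * same + (m - s) * lower)
    return counts


def brute_force_counts(n):
    """Count permutations of n distinct values by their number of negative differences."""
    counts = [0] * n
    for perm in permutations(range(n)):
        s = sum(1 for a, b in zip(perm, perm[1:]) if b < a)
        counts[s] += 1
    return counts
```

For $n = 5$ the recurrence gives 1, 26, 66, 26, 1, which sum to $5! = 120$ and are symmetric about $S = 2$.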


Using (1) Moore and Wallis [3] have tabulated $P\left(\left|S - \frac{n-1}{2}\right| \ge \left|S_0 - \frac{n-1}{2}\right|\right)$.
In using their table for our purpose, one has to keep in mind that we are using
a one tail region; therefore $P(S \ge S_0)$ is for $S_0 > \frac{n-1}{2}$ one half of the value
tabulated by Moore and Wallis.

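Clearly the first moment of $S$ is $\frac{n-1}{2}$, since the expected value of $-$ signs
equals the expected value of $+$ signs. To find higher moments we multiply (1)
by $\left(S - \frac{n-1}{2}\right)^i$, divide by $n!$ and sum over $S$. Then we obtain

(2)   $E_n\left[\left(S - \frac{n-1}{2}\right)^i\right] = \frac{1}{n} E_{n-1}\left[\left(S - \frac{n-1}{2}\right)^i (S + 1)\right] + \frac{1}{n} E_{n-1}\left[(n - S - 1)\left(S - \frac{n-1}{2} + 1\right)^i\right],$

where $E_n[f(S)]$ denotes the expectation of $f(S)$ in permutations of $n$ variables.
From (2) we have

$E_n\left[\left(S - \frac{n-1}{2}\right)^i\right] = \frac{1}{n} E_{n-1}\left[\left(S - \frac{n-2}{2} - \frac{1}{2}\right)^i \left(S - \frac{n-2}{2} + \frac{n}{2}\right) + \left(\frac{n}{2} - \left(S - \frac{n-2}{2}\right)\right)\left(S - \frac{n-2}{2} + \frac{1}{2}\right)^i\right].$

Putting $S - E(S) = x$ we obtain

(3)   $E_n(x^i) = \frac{1}{n} E_{n-1}\left[x\left(x - \frac{1}{2}\right)^i - x\left(x + \frac{1}{2}\right)^i\right] + \frac{1}{2} E_{n-1}\left[\left(x + \frac{1}{2}\right)^i + \left(x - \frac{1}{2}\right)^i\right].$

From the symmetry of the distribution as well as from (3) it may be seen that
all odd moments are 0 and therefore

$E\left[\left(x + \frac{1}{2}\right)^{2i} + \left(x - \frac{1}{2}\right)^{2i}\right] = 2E\left[\left(x + \frac{1}{2}\right)^{2i}\right],$

$E\left[x\left(x - \frac{1}{2}\right)^{2i} - x\left(x + \frac{1}{2}\right)^{2i}\right] = -2E\left[\left(x + \frac{1}{2}\right)^{2i+1}\right] + E\left[\left(x + \frac{1}{2}\right)^{2i}\right].$

Hence we obtain from (3)

(4)   $E_n(x^{2i+1}) = 0, \quad i = 0, 1, \cdots,$

$E_n(x^{2i}) = \frac{n+1}{n} E_{n-1}\left[\left(x + \frac{1}{2}\right)^{2i}\right] - \frac{2}{n} E_{n-1}\left[\left(x + \frac{1}{2}\right)^{2i+1}\right].$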

If all moments below the $2i$th moment are known, (4) becomes a difference equation
whose solution yields the $2i$th moment for $n > 2i$. Thus one obtains

$\sigma_n^2(S) = E_n(x^2) = \frac{n+1}{12}, \quad E_n(x^4) = \frac{5(n+1)^2 - 2(n+1)}{240},$

$E_n(x^6) = \frac{35(n+1)^3 - 42(n+1)^2 + 16(n+1)}{4032}.$
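
These closed forms can be checked exactly for small $n$. The sketch below is an illustration, not the derivation used in the paper: it rebuilds the permutation counts from recurrence (1) and computes the central moments of $S$ with rational arithmetic. The function names `descent_counts`, `central_moment` and `closed_form` are hypothetical.

```python
from fractions import Fraction


def descent_counts(n):
    """List [P_n(0), ..., P_n(n - 1)] from recurrence (1), starting at P_1(0) = 1."""
    counts = [1]
    for m in range(2, n + 1):
        previous = counts
        counts = []
        for s in range(m):
            same = previous[s] if s < len(previous) else 0
            lower = previous[s - 1] if s >= 1 else 0
            counts.append((s + 1) * same + (m - s) * lower)
    return counts


def central_moment(n, order):
    """Exact E_n[(S - (n - 1)/2)^order] over all permutations of n variables."""
    counts = descent_counts(n)
    mean = Fraction(n - 1, 2)
    weighted = sum(c * (s - mean) ** order for s, c in enumerate(counts))
    return weighted / sum(counts)


def closed_form(n, order):
    """The polynomial expressions stated above for orders 2, 4 and 6."""
    m = n + 1
    if order == 2:
        return Fraction(m, 12)
    if order == 4:
        return Fraction(5 * m ** 2 - 2 * m, 240)
    if order == 6:
        return Fraction(35 * m ** 3 - 42 * m ** 2 + 16 * m, 4032)
    raise ValueError("only orders 2, 4 and 6 are covered")
```

For $n = 7$, for instance, `central_moment(7, 4)` equals $19/15$, in agreement with the formula for $E_n(x^4)$.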


It is not difficult to prove from (4) by induction that $\lim_{n\to\infty} \frac{E_n(x^{2i})}{\sigma_n^{2i}(S)} = (2i-1)(2i-3)\cdots 3\cdot 1$.
To do this one proves first by induction that $E_n(x^{2i})$ is for $n > 2i$ a
polynomial in n of degree i. It can then be proved by induction that the first
coefficient of this polynomial is $(2i-1)(2i-3)\cdots 3\cdot 1/12^i$, from which the
assertion follows. Since $(2i-1)\cdots 3\cdot 1$ are the even moments of a normal distribution
with variance 1, it follows that $\frac{\left(S - \frac{n-1}{2}\right)\sqrt{12}}{\sqrt{n+1}}$ is in the limit normally
distributed with mean 0 and variance 1. This result follows, however, also
easily from Theorem 2 of a paper by Wolfowitz [4].
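The recursive computation of the moments can be checked numerically. The following is a minimal sketch, not the author's procedure: it builds the counts $P_n(S)$ from recursion (1) and evaluates central moments in exact rational arithmetic, so that they can be compared with the closed forms $(n+1)/12$, $(5(n+1)^2 - 2(n+1))/240$ and $(35(n+1)^3 - 42(n+1)^2 + 16(n+1))/4032$. The function names `descent_counts` and `central_moment` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def descent_counts(n):
    """Return [P_n(0), ..., P_n(n-1)] built up from recursion (1)."""
    counts = [1]  # one variable: a single permutation, no differences at all
    for m in range(2, n + 1):
        prev = counts
        counts = []
        for s in range(m):
            # (S + 1) * P_{m-1}(S): the new element leaves the count unchanged
            stay = (s + 1) * prev[s] if s < len(prev) else 0
            # (m - S) * P_{m-1}(S - 1): the new element adds one negative difference
            grow = (m - s) * prev[s - 1] if s >= 1 else 0
            counts.append(stay + grow)
    return counts


def central_moment(n, order):
    """Exact moment of S about (n - 1)/2 for random permutations of n variables."""
    counts = descent_counts(n)
    mean = Fraction(n - 1, 2)
    total = factorial(n)
    return sum(Fraction(c, total) * (s - mean) ** order
               for s, c in enumerate(counts))


if __name__ == "__main__":
    for n in (7, 8, 10):
        print(n, central_moment(n, 2), central_moment(n, 4), central_moment(n, 6))
```

Exact fractions are used instead of floats so that agreement with the polynomial expressions can be tested by equality rather than by a tolerance.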




HENRY B. MANN 


It is also possible to show by induction from equation (4) that for $n > 2i$ the
2i-th moment of S is smaller than the corresponding moment of a normal distribution
with variance $\frac{n+1}{12}$.

3. The power of the S test. Let us assume now that one of the alternatives
of the class $\Omega$ is true. This is to say $p_i = P(x_i > x_{i+1}) = \tfrac12 + \epsilon_i$,
$\sum_{i=1}^{n-1} \epsilon_i = \lambda_n(n - 1)$, $\liminf \lambda_n = \lambda > 0$. Let

$z_i = 1$ if the ith sign is $-$,
$z_i = 0$ if the ith sign is $+$.

We shall show that

$P(z_{i+1} = 1 \mid z_i = 1) \leq P(z_{i+1} = 1).$

We have


r f dUxi) f d/3(xj)1 f dfiiXi) < f dfi(Xi) f df3(xs) [ d/j(xj) 

< [ df'iixt) r f dftiXi) f dfiixi). 

Adding [ dft(x 2 } [ dfi{xi) f dfiC^i) to both sides of this inequality we 
J—aa J 


have 


f df 2 (xi) f dfz{xi) < f dMxi) r [ d/z(xj) [ dfi{xz). 

J^QQ JL«||g OQ I 1 ^—qO tJa^OO 

Integrating both sides with respect to xi, we obtain 
[ dfi{xi) f dfiixi) [ dfzixz) 

•l—CO *^(|Q 

< dMxi) d/!(a:s)Jj^/ dMxi) £ d/3(xj)j 


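or

$P(z_1 = 1 \text{ and } z_2 = 1) \leq P(z_1 = 1)\cdot P(z_2 = 1).$

From this it follows that $\sigma_{z_i z_{i+1}} \leq 0$. Since $\sigma_{z_i}^2 = \tfrac14 - \epsilon_i^2$ we have
$\sigma_S^2 \leq \sum_{i=1}^{n-1}\left(\tfrac14 - \epsilon_i^2\right) \leq (n - 1)\left(\tfrac14 - \lambda_n^2\right)$. Moreover $E(S) = \frac{n-1}{2} + \lambda_n(n - 1)$.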

Let $\lambda' = \lambda$ if $\lambda < \tfrac12$ and $0 < \lambda' < \lambda$ if $\lambda = \tfrac12$. The critical region is for
sufficiently large n given approximately by $S \geq \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}$, where t
depends on the level of significance $\alpha$ and must be chosen so that
$\frac{1}{\sqrt{2\pi}}\int_t^{\infty} e^{-x^2/2}\,dx = \alpha$. Hence, if we can show that under any alternative H of the class
$\Omega$ and for any $\epsilon > 0$


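(5)    $P\left(S \leq E(S) - \frac{t}{2}\sqrt{(n - 1)(1 - 4\lambda'^2)}\right) \leq \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-t} e^{-x^2/2}\,dx + \epsilon$

for every $t \geq T > 0$, $n > N(\epsilon, H, T)$, then we shall be able to give a lower bound
for the power of the S test. The power of the S test is approximately given
by $P\left(S \geq \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}\right)$.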

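From (5) we have

(6)    $P\left(S \geq \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}\right) \geq \frac{1}{\sqrt{2\pi}}\int_{\frac{t\sqrt{n+1} - 2\lambda_n(n-1)\sqrt3}{\sqrt{3(n-1)(1 - 4\lambda'^2)}}}^{\infty} e^{-x^2/2}\,dx - \epsilon$

for $\frac{t\sqrt{n+1} - 2\lambda_n(n-1)\sqrt3}{\sqrt{3(n-1)(1 - 4\lambda'^2)}} \leq -T < 0$, $n > N(\epsilon, H, T)$.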


The author considers it safe to assume that (6) holds with a fairly small $\epsilon$ for
$n > 12$ if $\lambda'$ in (6) is replaced by $\lambda'_n$, where $\lambda'_n = \lambda_n$ if $\lambda_n < \tfrac12$ and $\lambda'_n < \tfrac12$ if $\lambda_n = \tfrac12$,
and if $\lambda_n$ is not too close to $\tfrac12$. He bases this belief on the rapidity with which
the distribution of S approaches normality under the null hypothesis of randomness,
and on the fact that at least under the null hypothesis the moments of S are
smaller than the corresponding moments of a normal distribution. It may also
be seen from the following derivation of (6) that in many cases the power of the S
test will be considerably above the lower bound given in (6).
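The approximate critical region $S \geq \frac{n-1}{2} + t\sqrt{\frac{n+1}{12}}$ can be illustrated with a short computation. This is a minimal sketch under stated assumptions: the observations contain no ties, the level $\alpha$ defaults to 0.05, and the names `s_statistic` and `s_test` are hypothetical. It uses only the null mean $(n-1)/2$ and variance $(n+1)/12$ of S; it does not evaluate the lower bound (6).

```python
from math import sqrt
from statistics import NormalDist


def s_statistic(xs):
    """Count the negative signs among the differences x[i+1] - x[i]."""
    return sum(1 for a, b in zip(xs, xs[1:]) if b < a)


def s_test(xs, alpha=0.05):
    """One-tail S test using the normal approximation to the null distribution."""
    n = len(xs)
    s = s_statistic(xs)
    t = NormalDist().inv_cdf(1 - alpha)           # upper alpha point of N(0, 1)
    spread = sqrt((n + 1) / 12)                   # null standard deviation of S
    critical = (n - 1) / 2 + t * spread           # boundary of the critical region
    z = (s - (n - 1) / 2) / spread                # standardized statistic
    return s, critical, z, s >= critical


if __name__ == "__main__":
    falling = [20 - i for i in range(20)]         # strictly decreasing series
    print(s_test(falling))
```

Large values of S (many negative differences) indicate a downward trend, which is the direction of the alternatives with $p_i > \tfrac12$.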

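To prove (5), we need the following two lemmas.

Lemma 1. Let $P(x \leq t) = f(t)$. Let further $E(z) = 0$, $E(z^2) = \sigma^2$. Then for
every $\delta > 0$

(7)    $f(t + \delta) + \frac{\sigma^2}{\delta^2} \geq P(x + z \leq t) \geq f(t - \delta) - \frac{\sigma^2}{\delta^2}.$

Proof: Applying Tschebycheff's inequality we have

$P(x + z \leq t) \leq P(x \leq t + \delta) + P(x > t + \delta \text{ and } z < -\delta)$
$\leq P(x \leq t + \delta) + P(z < -\delta) \leq f(t + \delta) + \frac{\sigma^2}{\delta^2},$

$P(x + z \leq t) \geq P(x \leq t - \delta \text{ and } z \leq \delta)$
$\geq P(x \leq t - \delta) - P(z > \delta) \geq f(t - \delta) - \frac{\sigma^2}{\delta^2}.$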





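Lemma 2. Let $\{x_i\}$, $i = 1, 2, \cdots$ be a sequence of independent random variables
with mean 0, bounded kth absolute moment, $k > 2$, and variance $\sigma_i^2$. Let $M > 0$
and $\limsup_{n\to\infty}\left(\frac{1}{n}\sum_{i=1}^{n}\sigma_i^2\right)^{1/2} \leq M$. Form the sequence of random variables
$v_n = \frac{\sum_{i=1}^{n} x_i}{M\sqrt{n}}$; then for any $\epsilon > 0$ and any $t \geq T > 0$

(8)    $P(v_n \leq -t) \leq \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-t} e^{-z^2/2}\,dz + \epsilon$ for $n > N(\epsilon, T)$.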


Proof. Form a sequence ma with, lim m« = 0. Let = a:“ + 

«t^ae 

where 

2« „ ^ 

" M s/n’ " M Vn 

denotes summation over all i for which $\sigma_i^2 > m_\alpha$ and all sums extend from
one to n.

Let be the distribution of x, then by Lemma 1 

rni-t + + ^ ^ 


Now we distinguish two cases.

1st Case. The number of integers i with $\sigma_i^2 > m_\alpha$ is for some $\alpha$ of order n.
In this case the distribution of $x^\alpha$ differs arbitrarily little from a sequence of normal distributions
with mean 0 and the upper limit of the variances at most 1.

2nd Case. The number of integers i with $\sigma_i^2 > m_\alpha$ is for every $\alpha$ of smaller
order than n. In this case $x^\alpha$ converges stochastically to 0. In both cases
(8) holds true since $m_\alpha$ can be chosen arbitrarily small.

We can now prove (5). It follows easily from Tschebycheff's theorem that
(5) is true if $\lambda = \tfrac12$. Hence we may assume $\lambda < \tfrac12$. Let $z_i$ be defined as at
the beginning of this section. Form

.,v_ 2(2< - E{z,)) 

Via - 1)(1 - 4X=)’ 

, 2(zyt - E{z„)) 2iz, - Eizf))__ 

V(n - 1)(1 - 4X»)' "" V(n - 1)(1 - 4X=) 

where m' = gk is the largest integer multiple of k which does not exceed (n ~ 1). 
We form further 


Since o-;* < j 


a: n = 22 y M 2 

/-I 

\(g - 1) 


jmf 

h V' It 

« = w,. 

1-1 


i(7i - 1)(1 - 4X') - A:(l - 4X‘) 


^ it follows from Lemma 1 that 


the distribution of 


2{S - Em 
Vin - 1)(1 - 4X*) 


differs arbitrarily little from the distribution
of $x_n$ for sufficiently large n and k. The second and the third absolute
moment of $\sqrt{n - 1}\,v_j$ are bounded. Hence $\sqrt{n - 1}\,v_j$ fulfills the conditions
of Lemma 2. The application of Lemma 2 yields (5) and consequently (6).

The integer $N(\epsilon, H, T)$ is independent of t provided the lower limit of the
integral does not exceed $-T$. Hence we have proved

Theorem. Let $t_1, t_2, \cdots$ be any sequence of numbers satisfying the condition

$\frac{t_n\sqrt{n+1} - 2(n-1)\lambda_n\sqrt3}{\sqrt{(3n-3)(1 - 4\lambda'^2)}} \leq -T < 0,$

where $\lambda' = \liminf \lambda_n$ if $\liminf \lambda_n < \tfrac12$ and $0 < \lambda' < \tfrac12$ otherwise. Let $P_n(S, H)$
be the power of the S test with respect to the alternative H and critical region
$S \geq \frac{n-1}{2} + t_n\sqrt{\frac{n+1}{12}}$. Then

(9)    $\liminf_{n\to\infty}\; P_n(S, H)\Big/\left[\frac{1}{\sqrt{2\pi}}\int_{\frac{t_n\sqrt{n+1} - 2(n-1)\lambda_n\sqrt3}{\sqrt{(3n-3)(1 - 4\lambda'^2)}}}^{\infty} e^{-x^2/2}\,dx\right] \geq 1.$

It is worthwhile to remark that (9) is sharp. That is to say there exist alternatives
for which the left side of (9) is equal to 1. This is obviously the case
for any alternative with $P(x_i > x_{i+1}) = \tfrac12 + \lambda$ and $P(z_i = 1 \text{ and } z_{i+1} = 1) =
P(z_i = 1)\cdot P(z_{i+1} = 1)$. These conditions are, for instance, fulfilled by the
alternative given by $P(x_{i+1} = a - \delta - \cdots - \delta^i) = \tfrac12 + \lambda$, $P(x_{i+1} = c + \delta + \cdots + \delta^i) = \tfrac12 - \lambda$,
$i = 1, 2, \cdots$, where $(a - c) > \frac{2\delta}{1 - \delta} > 0$.

If $t_n = t$ for every n then (9) implies the consistency of the test if the order
of $\lambda_n$ is larger than $1/\sqrt{n}$. It may also be seen that the test is not consistent
with respect to alternatives for which $\lambda_n$ is of order at most equal to $1/\sqrt{n}$.
This remark refers of course only to alternatives for which $x_i$ is independent of
$x_j$ for $i \neq j$.


REFERENCES

[1] J. Wolfowitz, "On the theory of runs with some applications to quality control,"
Annals of Math. Stat., Vol. 14 (1943), pp. 280-288.

[2] A. Wald and J. Wolfowitz, "On a test whether two samples are from the same population,"
Annals of Math. Stat., Vol. 11 (1940), p. 147.

[3] Geoffrey H. Moore and W. Allen Wallis, "Time series significance tests based on
signs of differences," Jour. of the Amer. Stat. Assoc., Vol. 38 (1943), pp. 133-165.

[4] J. Wolfowitz, "The asymptotic distribution of runs up and down," Annals of Math.
Stat., Vol. 15 (1944), pp. 163-173.

[5] P. A. MacMahon, "On the compositions of numbers," Phil. Trans. of the Royal Soc. of
London, Vol. 207, pp. 65-134.



THE ASYMPTOTIC DISTRIBUTION OF RUNS OF CONSECUTIVE
ELEMENTS

By Irving Kaplansky
New York City

In a permutation of $1, 2, \cdots, n$ let r denote the number of instances in which
i is next to $i + 1$, i.e., in which either of the successions $(i, i + 1)$ or $(i + 1, i)$
occurs. Thus for the permutation 234651, r = 3. In [3] Wolfowitz¹ has proposed
the use of r for significance tests in the non-parametric case, and in [4]
he has shown that asymptotically r has the Poisson distribution with mean
value 2. It is to be noted that $W(R)$, the number of runs as defined by Wolfowitz,
is equal to $n - r$.

In this note we shall derive more explicit results concerning the asymptotic
distribution of r. In a random permutation (all permutations being regarded
as equally probable) let the probability of exactly r successions as above be
$P(n, r)$, and let $M(n, k)$ denote the k-th factorial moment of the distribution,
that is

$M(n, k) = \sum_r r(r - 1)\cdots(r - k + 1)P(n, r).$

We shall show that

(1)    $M(n, k) = 2^k\left[1 - \frac{k+1}{2k}\binom{k}{1}\frac{k}{n} + \frac{k+2}{2^2 k}\binom{k}{2}\frac{k(k-1)}{n(n-1)} - \cdots\right],$

(2)    $P(n, r) = \frac{2^r e^{-2}}{r!}\left[1 - \frac{r^2 - 3r}{2n} + \frac{r^4 - 8r^3 + 9r^2 + 22r - 16}{8n(n-1)}\right] + O(n^{-3}).$

Since $2^k$ is the k-th factorial moment of the Poisson distribution with mean 2,
either of these results serves to verify the asymptotic Poisson character of the
distribution of r.

It would be possible to obtain some kind of explicit formula for the general
term of (2), but there seems to be no reasonably simple form.

Proof of (1). Let $A_i$ denote the event "$i + 1$ comes right after i" and $B_i$
the event "i comes right after $i + 1$" ($i = 1, \cdots, n - 1$). The joint probability
of k of these $2n - 2$ events is either 0, if they are incompatible,
or $(n - k)!/n!$ if they are compatible, for in the latter case we in effect assign
positions for k of the elements and are then free to permute the $n - k$ others.
Let $f(n, k)$ denote the number of ways of selecting k compatible events. Then
it is known that ([1], eq. (40))

(3)    $M(n, k) = k!\,f(n, k)(n - k)!/n! = f(n, k)\Big/\binom{n}{k}.$


¹ I am indebted to Dr. Wolfowitz for calling my attention to this problem, and to its
identity with what I called the "n-kings problem" in [2].


The relations of incompatibility can be summarized by the statement that
$A_i$ is incompatible with $B_j$ if $|i - j| \leq 1$. In view of (3), our task thus reduces
to the proof of the following combinatorial lemma.

Lemma. Suppose $2n - 2$ objects $A_1, \cdots, A_{n-1}, B_1, \cdots, B_{n-1}$ are given.
Let $f(n, k)$ denote the number of ways of selecting k objects with the restriction that
$A_i$ and $B_j$ must not both be chosen when $|i - j| \leq 1$. Then

(4)    $\frac{f(n, k)}{2^k} = \sum_{i=0}^{k} (-1)^i\,\frac{k + i}{2^i k}\binom{k}{i}\binom{n - i}{k - i}.$


Proof. We split the acceptable selections into two subsets: those which
include $A_{n-1}$ and those which do not. Let the latter be $g(n, k)$ in number.
Since the selections which include $A_{n-1}$ must omit $B_{n-1}$ and $B_{n-2}$, it is clear that
they are $g(n - 1, k - 1)$ in number. Thus

(5)    $f(n, k) = g(n, k) + g(n - 1, k - 1).$

Similarly we split the selections which omit $A_{n-1}$ according as they omit or
include $B_{n-1}$; we obtain

(6)    $g(n, k) = f(n - 1, k) + g(n - 1, k - 1).$

Elimination of g from (5) and (6) yields²

(7)    $f(n, k) = f(n - 1, k) + f(n - 1, k - 1) + f(n - 2, k - 1).$

We can now make an inductive proof of (4). Assuming (4), we have


$\frac{f(n, k) - f(n - 1, k)}{2^k} = \sum_i (-1)^i\,\frac{k + i}{2^i k}\binom{k}{i}\binom{n - i - 1}{k - i - 1},$

$\frac{f(n - 1, k - 1) + f(n - 2, k - 1)}{2^k} = \sum_i (-1)^i \binom{n - i - 1}{k - i - 1}\left[\frac{k + i - 1}{2^i(k - 1)}\binom{k - 1}{i} + \frac{k + i - 2}{2^i(k - 1)}\binom{k - 1}{i - 1}\right].$

In view of the identity

$\frac{k + i}{k}\binom{k}{i} = \frac{k + i - 1}{k - 1}\binom{k - 1}{i} + \frac{k + i - 2}{k - 1}\binom{k - 1}{i - 1}$

we now readily verify that the right hand side of (4) satisfies (7). To complete
the induction we must check the appropriate boundary conditions. According
to (4) we have

$\frac{f(k, k)}{2^k} = \sum_{i=0}^{k} (-1)^i\,\frac{k + i}{2^i k}\binom{k}{i} = 0,$

$f(n, 1) = 2n - 2$, both as they should be.
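The lemma lends itself to a direct numerical check. The sketch below is illustrative only: it computes $f(n, k)$ from recursion (7) with the boundary values $f(n, 0) = 1$ and $f(n, k) = 0$ for $n \leq k$, $k \geq 1$ (an assumption chosen to agree with $f(k, k) = 0$ and $f(n, 1) = 2n - 2$), compares it with the sum in (4), and confirms relation (3) by enumerating all permutations for a small n. The names `f_rec`, `f_closed`, `successions` and `factorial_moment_brute` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import permutations
from math import comb, factorial


@lru_cache(maxsize=None)
def f_rec(n, k):
    """f(n, k) from recursion (7); boundary values are an assumption (see lead-in)."""
    if k == 0:
        return 1
    if n <= k:          # at most n - 1 successions can occur together
        return 0
    return f_rec(n - 1, k) + f_rec(n - 1, k - 1) + f_rec(n - 2, k - 1)


def f_closed(n, k):
    """f(n, k) from the alternating sum (4), valid here for n >= k >= 1."""
    total = sum(Fraction((-1) ** i * (k + i), 2 ** i * k)
                * comb(k, i) * comb(n - i, k - i)
                for i in range(k + 1))
    return 2 ** k * total


def successions(perm):
    """Number r of adjacent pairs (i, i+1) or (i+1, i) in a permutation."""
    return sum(1 for a, b in zip(perm, perm[1:]) if abs(a - b) == 1)


def factorial_moment_brute(n, k):
    """k-th factorial moment M(n, k) of r by full enumeration (small n only)."""
    total = 0
    for perm in permutations(range(1, n + 1)):
        r = successions(perm)
        falling = 1
        for j in range(k):
            falling *= r - j
        total += falling
    return Fraction(total, factorial(n))


if __name__ == "__main__":
    print([f_rec(6, k) for k in range(1, 6)])
    print(successions((2, 3, 4, 6, 5, 1)))
```

The enumeration is factorial in n and is meant only as a cross-check of (3) and (4) for small cases.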


² This recursion formula is essentially the same as equation (20) in [2].


Note. There are various other formulas for $f(n, k)$; we have selected (4) as
it exhibits the asymptotic behaviour best. In an unpublished investigation
John Riordan obtained a neat representation as a hypergeometric function:

$f(n, k) = 2(n - k)F(1 - k, 1 + k - n; 2; 2)$

and derived corresponding recursion formulas. Essentially the same result
was given by Wolfowitz [3]. Still another formula given by Riordan is

A symbolic version is given in §5 of [2].

Proof of (2). From the formula of Poincaré ([1], eq. (29))

$r!\,P(n, r) = \sum_{k=r}^{n} (-1)^{k+r} M(n, k)/(k - r)!$

or, in a cabalistic symbolic form, $P(n, r) = M^r e^{-M}/r!$. We substitute the successive
terms of (1) and we may let the sum run to infinity at a cost of $O(n^{-m})$
for any positive m. The first term contributes³

$\sum_{k=r}^{\infty} (-1)^{k+r}\,2^k/(k - r)! = 2^r \sum_{i=0}^{\infty} (-2)^i/i! = 2^r e^{-2}.$

Again since

$k^2 + k = (k - r)(k - r - 1) + (2r + 2)(k - r) + r^2 + r,$

the next term yields

$\sum_{k=r}^{\infty} (-1)^{k+r}(k^2 + k)\,2^{k-1}/(k - r)! = 2^r e^{-2}\left(2 - 2r - 2 + \frac{r^2 + r}{2}\right) = 2^r e^{-2}\,\frac{r^2 - 3r}{2},$

and so on in obvious fashion.

Some indication of the asymptotic behavior of $P(n, r)$ is afforded by the following
table for n = 10. It is to be noted that, because of the form of (2),
the approach to Poisson is much more rapid for r = 0 and 3 than for other r.

  r    P(10, r)    Poisson    First two terms of (2)
  0    .132        .135       .135
  1    .300        .271       .298
  2    .305        .271       .298
  3    .179        .180       .180
  4    .065        .090       .072
  5    .016        .036       .018
  6    .002        .012       .001
  7    .000        .003       .001
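The columns of the table can be recomputed as follows. This is a sketch under assumptions, not the author's computation: $f(n, k)$ is taken from recursion (7) with $f(n, 0) = 1$ and $f(n, k) = 0$ for $n \leq k$, the exact $P(n, r)$ is obtained from the factorial moments $M(n, k) = f(n, k)/\binom{n}{k}$ through the inversion formula used in the proof of (2), and the "first two terms" column is taken to mean the Poisson term multiplied by $1 - (r^2 - 3r)/(2n)$. All function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, exp, factorial


@lru_cache(maxsize=None)
def f(n, k):
    """Number of compatible selections of k events, from recursion (7)."""
    if k == 0:
        return 1
    if n <= k:
        return 0
    return f(n - 1, k) + f(n - 1, k - 1) + f(n - 2, k - 1)


def exact_p(n, r):
    """Exact P(n, r): r! P = sum over k >= r of (-1)^(k-r) M(n, k) / (k-r)!."""
    total = Fraction(0)
    for k in range(r, n):
        moment = Fraction(f(n, k), comb(n, k))      # M(n, k), relation (3)
        total += Fraction((-1) ** (k - r), factorial(k - r)) * moment
    return total / factorial(r)


def poisson(r, mean=2.0):
    """Poisson probability with mean 2, the limiting distribution of r."""
    return exp(-mean) * mean ** r / factorial(r)


def two_terms(n, r):
    """Poisson term with the first-order correction of (2)."""
    return poisson(r) * (1 - (r * r - 3 * r) / (2 * n))


if __name__ == "__main__":
    for r in range(8):
        print(r, round(float(exact_p(10, r)), 3),
              round(poisson(r), 3), round(two_terms(10, r), 3))
```

Working with exact fractions for $P(n, r)$ avoids the cancellation that the alternating sum would otherwise suffer in floating point.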


³ My thanks are due to Mr. Riordan for correcting an error in this section, and for many
helpful suggestions concerning the entire paper.


REFERENCES

[1] M. Fréchet, "Les probabilités associées à un système d'événements compatibles et
dépendants," Actualités Scientifiques et Industrielles, no. 859, Paris, 1940.

[2] I. Kaplansky, "Symbolic solution of certain problems in permutations," Bull. Amer.
Math. Soc., Vol. 50 (1944), pp. 906-914.

[3] J. Wolfowitz, "Additive partition functions and a class of statistical hypotheses,"
Annals of Math. Stat., Vol. 13 (1942), pp. 247-279.

[4] J. Wolfowitz, "Note on runs of consecutive elements," Annals of Math. Stat., Vol. 15
(1944), pp. 97-98.



ON THE APPROXIMATE DISTRIBUTION OF RATIOS

By P. L. Hsu

National University of Peking

The purpose of this paper is to apply Cramér's theorem of asymptotic expansion¹
and Berry's theorem² to study the approximate distribution of ratios of the
following two types:

(I)    $Z = \frac{1}{n}(Y_1 + \cdots + Y_n)\Big/\frac{1}{m}(X_1 + \cdots + X_m),$

(II)   $Z = Y\Big/\frac{1}{m}(X_1 + \cdots + X_m) = Y/\bar X.$

In (I) the $X_i$, $Y_j$ are independent, the $Y_j$ are equi-distributed,³ and the $X_i$ are
equi-distributed and positive. In (II) $X_1, \cdots, X_m$, Y are independent and
positive, and the $X_i$ are equi-distributed.

1. The ratio (I). Assume that (I1) the absolute kth moment of $X_i$ and that
of $Y_j$ are finite and positive, where k is a fixed integer $\geq 3$,

(I2) the distribution of $X_i$ and that of $Y_j$ are non-singular.

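Let

$\xi = E(X_i), \quad \eta = E(Y_j), \quad \sigma^2 = E(X_i^2) - \xi^2, \quad \tau^2 = E(Y_j^2) - \eta^2,$

and

$U = \frac{\sqrt m}{\sigma}(\bar X - \xi), \qquad V = \frac{\sqrt n}{\tau}(\bar Y - \eta).$

Let $F(x)$, $G(x)$ and $H(x)$ be respectively the distribution functions of Z, U and
V. Let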


- f" + I’Y 

\ m ^ nj ’ 


{ti — IJ 


Then the relation Z < a: is equivalent to 


zeU , rV ^ 
hy/m by/n ~ 


¹ H. Cramér, Random Variables and Probability Distributions (1937), Chap. 7.

² A. C. Berry, "The accuracy of the Gaussian approximation to the sum of independent
variates", Trans. Amer. Math. Soc., Vol. 49 (1941), pp. 122-130.

³ The $Y_j$ are said to be equi-distributed if all $Y_j$ have the same distribution function.


For simplicity we shall assume $x > 0$; the results are, however, general. Then

. x<tU , tV 

the distribution functions of — 7 — 7 =- and are 

by/m by/n 

Hence, by the theorem of convolution. 

Here we recall the theorems of Cramér and Berry: Under the conditions (I1)
and (I2)


( 2 ) 

where 


G(:r) =$(a:) + i:^^+ 


i—i m 


m 


,4(1-2) > 


3.(x) = e-'-'’ dy, P,{x) = i: cy,$''+'’>(x). 


and $|D_m|$ is less than a positive number which depends only on k and the distribution
of $X_i$. If k = 3, condition (I2) may be removed.⁴

Analogously, 


(3) 


H{x) = $(x) + 




E 

y-i 


Qpix) 


n 


r/i 


+ 


D'm 

^4(*-2) > 


whete 

Q.ix) = E 

j-i 


In the sequel we shall use the letter $A_k$ to denote an unspecified quantity such
that $|A_k|$ is less than a positive number which depends only on k, the distribution
of $X_i$ and the distribution of $Y_j$.

Using (2) we have 


(4) 


1-3 


1 - G{-x} ~ $(x) + X) 


i-iyPrix) 


+ 


D, 


and making this substitution in (1) we get

⁴ This last assertion constitutes Berry's theorem.


and so by partial integration, 


r(i) = 

+ ‘ft§'r riC’-y^bLzJl) dp.C'-y^) 

jii—l J~tn \ T / \ / 

Making the transformation y = <rzv/b-\/m and writing 


a-ynx 


we get 


Fix) = f Hidu ~ dv 4- E f H(au - ^v)PUv) du + 


= j 4-4. 

For $I_0$ we use (3) and obtain


lo - f 4'(«u ~ dy + E 4ri [ “ /3ii)^'(v) dv + 

<>~IC ,|_1 n ' <i~n 


For $I_\nu$ we use (3) with k replaced by $k - \nu$. Thus


J, = / f (oru - Pv)Pliu) dv + E -~2 / ~ MP'M dv + ~. 

'<-» , 1-1 J-K TV 


Alt 

Kt-J-i') 


Combining these results we get 


(6) F(x) = [ ^v)^'(v) dii 4- S 4-0 f - i3y)4>'(y) dy ’ 

•*-« /I ' *L.ao 

+ S ‘s’ f.- 


where 


R, = At- 4, Ai_ 4 - y 






Now by (5), $\alpha > 0$ and $\alpha^2 + \beta^2 = 1$. For such values of $\alpha$ and $\beta$, however, it
follows easily from the theorem of convolution that


f i‘(otu — (3v)$'(v) dv = 

•L-bc 



As differentiation under the integration sign is justified by the boundedness of
the derivatives of $\Phi$ we have

c? — ^v)^'(v) dv = 

Repeated partial integration then gives 

r dv = r - fivMv) 

tt to 


dv 




1 


Hence 


f Q,(au — fiv)i'(v)dv = ^ dj, f — /3v)i'(v)dv 

X>ao ^<“1 *^00 


-1 


j -1 a 


f 9(oeu - l3v)P',(v) dv = S cj, f ^(au - dv 

J-m I“1 •*-« 




f Qi,(ctu — fiv)P',(v) dv = 

JL«e {-.1 


.r+V 


Making all these substitutions in (6) we obtain the final result 

/I 1 

If k = 3, the result remains true without the condition (I2).


2. The ratio (II). Here we make the following assumptions:

(II1) The kth moment of $X_i$ is finite and positive, where k is a fixed integer
$\geq 3$; $E(X_i) = 1$,⁵ $E(X_i^2) - 1 = \sigma^2$.

(II2) The distribution of $X_i$ is non-singular.


⁵ As the case $E(X_i) = 0$ is excluded, there is no loss of generality in this assumption.


Let $U = \sqrt{m}(\bar X - 1)/\sigma$, and $F(x)$, $G(x)$ and $H(x)$ be respectively the
distribution functions of Z, U and Y. Then

$F(x) = \Pr\left\{Y - \frac{\sigma x U}{\sqrt m} \leq x\right\}.$

Because of the positiveness of $X_i$ and Y we may always assume $x > 0$. Then,
by the theorem of convolution,

"w - £{i - K'— 

Using (4) we have 


diKv). 


Fix) = rU(:s^ —+ E 


(-I)' p ( Vm jx - y) 


'H 


(rX 


)} 


dHiy) +^*; 2 ), 


where, as throughout the rest of this paper, $A_k$ represents an unspecified quantity
such that $|A_k|$ is less than a positive number depending only on k, the distribution
of $X_i$ and the distribution of Y. By partial integration we get


m 


A, 


(7) F(x) =£“’;/(* + ^ + 

\ (xa; / ,_i j 

An interesting special case is the following: Suppose that (II3) //‘*~'\a:) 
exists and is continuous for all a: > 0; (114) the functions 

Ux) = iv = 1, - 3) 

are bounded, i.e. 

f,(x) = Ak ; 

(II3) there is a positive constant c < 1 such that 

= At 

for all X > 0 and (1 — c)x < j/ < (1 -f- c)x. Under these conditions we have 

/ (TXZ \ 

V~V^)^ 




H 


►-0 


(-l)V^x^Fjr’(x) 

v\ 


+ TAr-2)rmH‘-5r-^^ 


and so, for 1 21 < wo have 

(T 


(8) 









Separate now the integral in (7) into two parts: 


?! = / 


^ c\/mf9 


= f . 


|/.|</ ^ + 

Evidently this last integral is exponentially small and so is $I_2$. By (8),




-djb 


Combining these results we obtain 




= i: + E + 




where 


/afl = f dz. 

•LoO 


Now the following facts can easily be established by means of partial integration; 

(9) I «0 = 0 when a — ^ is even, 

(10) la^ = 0 when /3 — a > 1. 

By (9), the non-vanishing terms in ^ are the even terms and the non-vanishing 

1 

terms in 23 are those for which m + »' is even. Hence 
2 




E= 23 


0 m" 


14(1-3)1 [l(fc-3)] . U{t-01 [i(«;~4)] 2)1+1 / t 

Z _ V V V «>■ r I V' V Y' W'+i T 

- Lj 2-1 2-1 TT, ■i2,.2,,+2j+i + 2 ^ 2-1 2-1 •z^^+^+r • 

2 wO )i-i j-1 )i*.o /-I Wr^ 





Using (10) to reduce further we get 




> F-J ^-1,-1^"+' ^ £l po 


ito «*'+'+> ;“j ^x+F-M 

H{M)1 


(l(*~»)l, 1 

= I ^ 


«-o m”+‘^-.[|(4+i)i 


A II(M)1 , « 


Hence 

£ + I = f, + !!6 + !li±^, “'f;”'! 

1 * mm* m' 


F-J W' 

^«Ff*F+ E -f '' 

M-lf^-l)l ^[, 


Ak 


1 2f , 

= {■ + £- £ »,,{,+A. 

F^ m'/iTfi ^^ ' 






Hence 



Our conclusion is: Under the conditions (II1)-(II5) formula (11) is true;
if k = 3, (11) remains true without the condition (II2).



ON THE DISTRIBUTION OF THE SERIAL CORRELATION COEFFICIENT

By Herman Rubin

Cowles Commission for Research in Economics, University of Chicago

The distribution of the serial correlation coefficient, in samples drawn from
a parent distribution with zero serial correlation, has been studied by many
authors. Anderson [1] obtained the exact distribution. Dixon [4] and Koopmans [3]
have given approximate distributions, each attained by smoothing the
characteristic values of the numerator of $\bar r$ in (1) below. Dixon smoothed the
characteristic values in the generating function and obtained his results by
comparing the moments of the exact distribution with those of the approximation,
of which the first T are found to be exact. Koopmans smoothed the
characteristic values in the exact distribution function. Here we evaluate
Koopmans' result and show that it is the same as Dixon's approximation. It
thus appears that in this case it is immaterial whether the characteristic values
are smoothed before or after inverting the characteristic function. We also
add tables comparing confidence limits for the exact distribution, for the approximation
referred to, and for a normal approximation.

We define the serial correlation coefficient as

(1)    $\bar r = \frac{\sum_{t=1}^{T} x_t x_{t+1}}{\sum_{t=1}^{T} x_t^2}, \qquad x_{T+1} = x_1.$

Then Koopmans obtains, if the true value $\rho$ of $\bar r$ equals 0, and the $x_t$ are normally
and independently distributed with mean 0 and variance $\sigma^2$, the approximate
distribution

(2)    $h(\bar r, T) = \frac{(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\sin\tfrac12 T\alpha\,\sin\alpha\,d\alpha.$

Although in the distribution problem T is a positive integer, it is useful to
consider the right-hand member of (2) as the definition of $h(\bar r, T)$ for those
complex values of T for which it exists.

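Let $R(T)$ denote the real part of T. If $R(T) > 2N + 2$, we obtain

(3)    $\frac{d^N}{d\bar r^N}h(\bar r, T) = (-1)^N\left(\tfrac12 T - 2\right)\left(\tfrac12 T - 3\right)\cdots\left(\tfrac12 T - N - 1\right)\frac{(\tfrac12 T - 1)\,2^{\frac12 T}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - N - 2}\,\sin\tfrac12 T\alpha\,\sin\alpha\,d\alpha.$

Now, according to [2], tables 41, 42,

(4)    $\int_0^{\pi/2} (\cos\alpha)^{\frac12 T - N - 2}\,\sin\tfrac12 T\alpha\,\sin\alpha\,d\alpha = \frac{\pi T}{2^{\frac12 T - N + 1}}\cdot\frac{\Gamma(\tfrac12 T - N - 1)}{\Gamma(\tfrac12(T - N + 1))\,\Gamma(\tfrac12(1 - N))}.$

Denote by $h^{(N)}(0, T)$ the value of $\frac{d^N}{d\bar r^N}h(\bar r, T)$ for $\bar r = 0$. Then for $R(T) > 2N + 2$,

(5)    $h^{(N)}(0, T) = \frac{(-1)^N\,2^N\,\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12(T - N + 1))\,\Gamma(\tfrac12(1 - N))}.$

$h(\bar r, T)$ is analytic in $\bar r$ for $|\bar r| < 1$, $R(T) > 2$, and is analytic in T for $|\bar r| < 1$,
$R(T) > 2$. It follows by Hartogs's theorem [5] that $h(\bar r, T)$ is analytic in $\bar r$
and T for $|\bar r| < 1$, $R(T) > 2$. By analytic continuation we get that (5) holds
for $R(T) > 2$. Consequently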


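(6)    If N is odd, $h^{(N)}(0, T) = 0$;

(7)    if N is even, $\frac{h^{(N)}(0, T)}{h(0, T)} = \frac{2^N\,\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}{\Gamma(\tfrac12(T - N + 1))\,\Gamma(\tfrac12(1 - N))}.$

Let N = 2P, then

(8)    $\frac{1}{(2P)!}\,\frac{h^{(2P)}(0, T)}{h(0, T)} = \frac{(-1)^P\,2^{2P}}{(2P)!}\left(\frac{T - 1}{2}\right)\left(\frac{T - 3}{2}\right)\cdots\left(\frac{T - 2P + 1}{2}\right)\frac{1\cdot 3\cdots(2P - 1)}{2^P}$

       $= \frac{(-1)^P}{P!}\left(\frac{T - 1}{2}\right)\left(\frac{T - 3}{2}\right)\cdots\left(\frac{T - 2P + 1}{2}\right) = \left[\frac{1}{P!}\,\frac{d^P}{d(\bar r^2)^P}(1 - \bar r^2)^{\frac12(T - 1)}\right]_{\bar r = 0}.$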

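According to (5)

(9)    $h(0, T) = \frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}.$

Hence

(10)   $h(\bar r, T) = \frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T - 1)},$

which is the same as Dixon's expression (3.22).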
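The agreement between Koopmans' integral and the closed form can be checked numerically. The following sketch rests on one assumption that should be stated plainly: the constant in front of the integral in (2) is taken to be $(\tfrac12 T - 1)2^{\frac12 T}/\pi$, the value for which the integral reproduces the Type II density for T = 4 and T = 6. Composite Simpson integration is used, restricted to $T \geq 4$ so that the integrand stays bounded; the names `koopmans_integral` and `closed_form` are hypothetical.

```python
from math import acos, cos, gamma, pi, sin


def koopmans_integral(r, T, steps=4000):
    """Evaluate the integral form (2) by composite Simpson's rule (T >= 4)."""
    upper = acos(r)
    power = T / 2 - 2

    def integrand(a):
        base = max(cos(a) - r, 0.0)          # guard against rounding below zero
        return base ** power * sin(T * a / 2) * sin(a)

    h = upper / steps                        # steps must be even for Simpson
    total = integrand(0.0) + integrand(upper)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(j * h)
    constant = (T / 2 - 1) * 2 ** (T / 2) / pi   # assumed normalizing constant
    return constant * total * h / 3


def closed_form(r, T):
    """Type II density: Gamma(T/2 + 1) / (Gamma(T/2 + 1/2) Gamma(1/2)) (1 - r^2)^((T-1)/2)."""
    norm = gamma(T / 2 + 1) / (gamma(T / 2 + 0.5) * gamma(0.5))
    return norm * (1 - r * r) ** ((T - 1) / 2)


if __name__ == "__main__":
    for T in (4, 5, 6, 8):
        for r in (-0.5, 0.0, 0.3, 0.7):
            print(T, r, round(koopmans_integral(r, T), 6), round(closed_form(r, T), 6))
```

For T = 3 the integrand has an integrable singularity at the upper limit, so a substitution such as the one used in the text for that case would be needed before applying a fixed-step rule.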

A more elementary proof by complete induction for integral values of T can
be based on the recurrent differential equation (14), which is of interest in itself.
To this end we shall write (2) in a different form which is easily obtained through
partial integration:

(11)   $h(\bar r, T) = \frac{T\,2^{\frac12 T - 1}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 1}\,\cos\tfrac12 T\alpha\,d\alpha.$

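Differentiating with respect to $\bar r$,

$h'(\bar r, T) = -\frac{T(\tfrac12 T - 1)\,2^{\frac12 T - 1}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\cos\tfrac12 T\alpha\,d\alpha$

$= -\frac{T(\tfrac12 T - 1)\,2^{\frac12 T - 1}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\left(\cos\tfrac12(T - 2)\alpha\,\cos\alpha - \sin\tfrac12(T - 2)\alpha\,\sin\alpha\right)d\alpha$

(12)   $= -\frac{T(\tfrac12 T - 1)\,2^{\frac12 T - 1}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 1}\,\cos\tfrac12(T - 2)\alpha\,d\alpha$

       $\quad - \frac{T(\tfrac12 T - 1)\,2^{\frac12 T - 1}}{\pi}\,\bar r\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\cos\tfrac12(T - 2)\alpha\,d\alpha$

       $\quad + \frac{T(\tfrac12 T - 1)\,2^{\frac12 T - 1}}{\pi}\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\sin\tfrac12(T - 2)\alpha\,\sin\alpha\,d\alpha$

(13)   $= -\frac{T(\tfrac12 T - 1)\,2^{\frac12 T - 1}}{\pi}\,\bar r\int_0^{\arccos \bar r} (\cos\alpha - \bar r)^{\frac12 T - 2}\,\cos\tfrac12(T - 2)\alpha\,d\alpha,$

because the first and third terms in (12) cancel as may be shown by integrating
by parts.

Hence (13) reduces to the recurrent differential equation

(14)   $h'(\bar r, T) = -2\cdot\tfrac12 T\,\bar r\,h(\bar r, T - 2).$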


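Let us now assume that

(15)   $h(\bar r, T - 2) = \frac{\Gamma(\tfrac12 T)}{\Gamma(\tfrac12 T - \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T - 3)}.$

Then (14) becomes

(16)   $h'(\bar r, T) = -2\bar r\cdot\tfrac12 T\cdot\frac{\Gamma(\tfrac12 T)}{\Gamma(\tfrac12 T - \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T - 3)} = -2\bar r\cdot\tfrac12(T - 1)\cdot\frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T - 3)}.$

Integrating, one obtains

(17)   $h(\bar r, T) = \frac{\Gamma(\tfrac12 T + 1)}{\Gamma(\tfrac12 T + \tfrac12)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{\frac12(T - 1)}.$

No constant of integration occurs because (17) agrees with (5) for $\bar r = 0$ and
N = 0.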



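It remains to prove the validity of (17) for the initial values T = 3 and T = 4.
If T = 4,

(18)   $h(\bar r, 4) = \frac{4}{\pi}\int_0^{\arccos \bar r} \sin 2\alpha\,\sin\alpha\,d\alpha = \frac{8}{3\pi}\Big[\sin^3\alpha\Big]_0^{\arccos \bar r} = \frac{8}{3\pi}(1 - \bar r^2)^{3/2} = \frac{\Gamma(3)}{\Gamma(\tfrac52)\,\Gamma(\tfrac12)}\,(1 - \bar r^2)^{3/2}.$

For T = 3,

(19)   $h(\bar r, 3) = \frac{\sqrt2}{\pi}\int_0^{\arccos \bar r} \frac{\sin\tfrac32\alpha\,\sin\alpha}{\sqrt{\cos\alpha - \bar r}}\,d\alpha.$

Substitute $\cos\alpha = \bar r + (1 - \bar r)\sin^2\theta$. We get

(20)   $h(\bar r, 3) = \frac{2(1 - \bar r)}{\pi}\int_0^{\pi/2}\left\{(1 + 2\bar r)\cos^2\theta + 2(1 - \bar r)\sin^2\theta\,\cos^2\theta\right\}d\theta = \tfrac34(1 - \bar r^2) = \frac{\Gamma(\tfrac52)}{\Gamma(2)\,\Gamma(\tfrac12)}\,(1 - \bar r^2),$

which completes the proof.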

A short table of confidence limits is included, corresponding to the 5% and 
1 % significance levels, comparing the exact distribution given by Anderson [1] 
(the values in parentheses being graphically interpolated by him), the distribu¬ 
tion (10), and the normal curve with the same mean and standard deviation. 


Confidence limits for $\bar r$

              5%                           1%
  T    Exact    (10)    Normal      Exact    (10)    Normal
  3    .864     .729    .736        .970     .882    1.040
  4    .713     .669    .672        .898     .833    .950
  5    .622     .621    .622        .823     .789    .879
  6    .570     .582    .582        .762     .750    .823
  7    .545     .549    .548        .714     .715    .775
  8    (.521)   .521    .520        (.682)   .685    .736
  9    .498     .497    .496        .656     .658    .701
 10    (.477)   .476    .476        (.633)   .634    .672
 11    .457     .458    .456        .612     .612    .645
 15    .400     .400    .399        .543     .543    .564
 20    (.351)   .352    .351        (.480)   .482    .496
 25    .317     .317    .317        .437     .437    .448
 30    (.291)   .291    .291        (.404)   .403    .411
 35    (.271)   .271    .270        (.377)   .376    .382
 40    (.255)   .254    .254        (.355)   .354    .359
 45    .240     .240    .240        .335     .335    .339
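The "(10)" and "Normal" columns can be recomputed with a few lines of code. The sketch below is illustrative and rests on two assumptions: that the tabulated limits are one-sided upper limits (the value c with $P(\bar r \geq c)$ equal to the stated level), and that the normal curve has mean 0 and variance $1/(T + 2)$, which is the variance of the density (10). The tail area is obtained by Simpson integration and the limit by bisection; the function names are hypothetical.

```python
from math import gamma, sqrt
from statistics import NormalDist


def density(r, T):
    """Density (10): proportional to (1 - r^2)^((T - 1)/2) on (-1, 1)."""
    norm = gamma(T / 2 + 1) / (gamma(T / 2 + 0.5) * gamma(0.5))
    return norm * max(1 - r * r, 0.0) ** ((T - 1) / 2)


def upper_tail(c, T, steps=2000):
    """P(r >= c) under (10), by composite Simpson's rule on [c, 1]."""
    h = (1.0 - c) / steps
    total = density(c, T) + density(1.0, T)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * density(c + j * h, T)
    return total * h / 3


def limit_type2(T, level):
    """Upper limit c with tail area `level`, found by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail(mid, T) > level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def limit_normal(T, level):
    """Upper limit for a normal curve with mean 0 and variance 1/(T + 2)."""
    return NormalDist().inv_cdf(1 - level) / sqrt(T + 2)


if __name__ == "__main__":
    for T in (15, 20, 25):
        print(T, round(limit_type2(T, 0.05), 3), round(limit_normal(T, 0.05), 3),
              round(limit_type2(T, 0.01), 3), round(limit_normal(T, 0.01), 3))
```

The exact column depends on Anderson's distribution and is not reproduced by this sketch.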



It is thus seen that the distribution (10) provides satisfactory significance levels
for T > 9 whereas the normal approximation provides satisfactory 5% significance
levels for the same range. The normal approximation appears to be
unsatisfactory, however, at the 1% significance level even for T as high as 45.
The normal approximation here used is not the same as that used by Anderson

“k/ ^ T 

([1], p. 53), which assumes . ■ = to be normally distributed. 

"v 1 -|- 2f* 

The following table shows a comparison between a few more confidence limits
of the Type II curve (10) and the normal curve with the same first two moments
for a few values of T.

Confidence limits for $\bar r$

            5%              4%              3%              2%              1%
  T    (10)   Normal   (10)   Normal   (10)   Normal   (10)   Normal   (10)   Normal
 15    .400   .399     .423   .425     .452   .456     .488   .498     .543   .564
 20    .352   .351     .373   .373     .398   .401     .431   .438     .482   .496
 25    .317   .317     .336   .337     .360   .362     .390   .395     .437   .448

REFERENCES

[1] R. L. Anderson, Serial Correlation in the Analysis of Time Series, unpublished thesis,
Iowa State College, 1941.

[2] D. Bierens de Haan, Nouvelles Tables d'Intégrales Définies, Leyden, 1867.

[3] T. Koopmans, "Serial correlation and quadratic forms in normal variables", Annals
of Math. Stat., Vol. 13 (1942), pp. 14-33.

[4] W. J. Dixon, "Further contributions to the problem of serial correlation", Annals of
Math. Stat., Vol. 15 (1944).

[5] W. F. Osgood, Lehrbuch der Funktionentheorie, Vol. 2, Part 2, Leipzig, 1907.


NOTES

This section is devoted to brief research and expository articles, notes on methodology
and other short items.


A NOTE CONCERNING HOTELLING'S METHOD OF INVERTING A
PARTITIONED MATRIX

By F. V. Waugh

War Food Administration, Washington

Professor Hotelling recently presented several methods of computing the
inverse of a matrix.¹ Among these was a method of partitioning a square matrix
of 2p rows into four square matrices, a, b, c and d, of p rows each, resulting in
the partitioned matrix,

    [ a  b ]
    [ c  d ].

The inverse of this matrix can also be written as a partitioned matrix,

    [ A  C ]
    [ B  D ].

Then, multiplying the original matrix by its inverse we get four matrix equations,

    aA + bB = I        aC + bD = 0
    cA + dB = 0        cC + dD = I.

These equations can be solved for A, B, C, and D.

Professor Hotelling's solution requires the inversion of four p-rowed matrices.
It is possible, however, to solve these equations by formulas involving only two
inversions. The formulas are

    $D = (d - ca^{-1}b)^{-1}$        $B = -Dca^{-1}$
    $C = -a^{-1}bD$                  $A = a^{-1} - a^{-1}bB.$

As an example of the procedure let the given matrix be

    [  26   -10    15    32 ]
    [  19    45   -14    -8 ]
    [ -12    16    27    13 ]
    [  32    29   -35    28 ].


¹ Harold Hotelling, "Some new methods of matrix calculation," Annals of Math.
Stat., Vol. 14 (1943), pp. 1-34.


The necessary steps in computation are

    $a^{-1}$ =  [  .03309    .00735 ]
                [ -.01397    .01912 ]

    $ca^{-1}$ = [ -.62060    .21772 ]
                [  .65375    .78968 ]

    $a^{-1}b$ = [  .39345   1.00008 ]
                [ -.47723   -.60000 ]

    $ca^{-1}b$ = [ -12.35708   -21.60096 ]
                 [  -1.24927    14.60256 ]

Note that a convenient check at this point is to compute both
$(ca^{-1})b$ and $c(a^{-1}b)$.

    $d - ca^{-1}b$ = [  39.35708    34.60096 ]
                     [ -33.75073    13.39744 ]

    $(d - ca^{-1}b)^{-1} = D$ = [  .00790   -.02041 ]
                                [  .01991    .02322 ]

    $-a^{-1}bD = C$ = [ -.02302   -.01519 ]
                      [  .01572    .00419 ]

    $-Dca^{-1} = B$ = [  .01825    .01440 ]
                      [ -.00282   -.02267 ]

    $a^{-1} - a^{-1}bB = A$ = [  .02873    .02436 ]
                              [ -.00696    .01239 ]

These matrices are the four parts of the inverse, which can be written

    [  .02873    .02436   -.02302   -.01519 ]
    [ -.00696    .01239    .01572    .00419 ]
    [  .01825    .01440    .00790   -.02041 ]
    [ -.00282   -.02267    .01991    .02322 ].

The accuracy of the computations can be checked by multiplying the original
matrix by the computed inverse matrix. The product should, of course, be a
close approximation of the identity matrix. If further accuracy is called for
we can use Hotelling's iterative formula,

    $C_1 = C_0(2 - AC_0)$

where $C_0$ is the estimated inverse; A is the original matrix; and $C_1$ is a second
approximation of the inverse.
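The two-inversion formulas can be tried out on the worked example. The following is a minimal sketch, not a general implementation: it assumes p = 2, so that each of the two required inversions is a 2 x 2 inverse written out explicitly, and it takes the entry in row 4, column 3 of the given matrix to be -35, the value implied by the printed intermediate results. The helper names (`inv2`, `partitioned_inverse`, `assemble`, `hotelling_step`, `residual`) are hypothetical.

```python
def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def matsub(x, y):
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def negate(x):
    return [[-p for p in row] for row in x]


def inv2(m):
    """Inverse of a 2 x 2 matrix (stands in for a general p-rowed inversion)."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def partitioned_inverse(a, b, c, d):
    """Return the blocks A, B, C, D of the inverse [[A, C], [B, D]]."""
    a_inv = inv2(a)                                   # first inversion
    a_inv_b = matmul(a_inv, b)
    c_a_inv = matmul(c, a_inv)
    D = inv2(matsub(d, matmul(c_a_inv, b)))           # second inversion
    C = negate(matmul(a_inv_b, D))
    B = negate(matmul(D, c_a_inv))
    A = matsub(a_inv, matmul(a_inv_b, B))
    return A, B, C, D


def assemble(A, B, C, D):
    """Stack the blocks in the layout [[A, C], [B, D]]."""
    return [ra + rc for ra, rc in zip(A, C)] + [rb + rd for rb, rd in zip(B, D)]


def hotelling_step(m, c0):
    """One refinement C1 = C0 (2I - M C0) of an approximate inverse C0."""
    mc = matmul(m, c0)
    correction = [[(2.0 if i == j else 0.0) - v for j, v in enumerate(row)]
                  for i, row in enumerate(mc)]
    return matmul(c0, correction)


def residual(m, inv):
    """Largest absolute deviation of M * inv from the identity matrix."""
    prod = matmul(m, inv)
    n = len(m)
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))


if __name__ == "__main__":
    a = [[26, -10], [19, 45]]
    b = [[15, 32], [-14, -8]]
    c = [[-12, 16], [32, 29]]
    d = [[27, 13], [-35, 28]]
    for row in assemble(*partitioned_inverse(a, b, c, d)):
        print([round(v, 5) for v in row])
```

Applying `hotelling_step` to an inverse rounded to five decimals shows how one iteration sharpens the result; the residual function gives the identity check described in the text.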



NEWS AND NOTICES

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items

Professor W. G. Cochran of Iowa State College has gone overseas as a consultant
for the United States War Department.

Professor A. R. Crathorne of the University of Illinois has retired with the
title of Professor Emeritus.

Professor William Feller of Brown University has been appointed Professor
of Mathematics at Cornell University, Ithaca, New York, as of July 1, 1945.

Associate Professor Joe J. Livers has returned to Montana State College at
Bozeman after receiving his doctorate in February at the University of Michigan.

Assistant Professor W. A. Vezeau of the University of Detroit has been appointed
Assistant Professor of Mathematics at St. Louis University.

Associate Professor S. S. Wilks of Princeton University has been promoted to
a professorship.

The American Statistical Association elected ten Fellows during 1944. Of
these ten, five are members of the Institute. They are A. E. Brandt, W. G.
Cochran, Gertrude M. Cox, Alan Treloar, and Sewall Wright. The President
of the Association is Dr. Walter A. Shewhart, a charter member of the Institute
and its President during 1944.


New Members 

The following persons have been elected to membership in the Institute: 

Allendoerfer, Asso. Prof. Carl B. Ph.D. (Princeton) Haverford College, Haverford, Pa. 
Beckstead, Lt. (j.g.) Gordon L. M.S. (Michigan) Aerologist, U. S. Navy. Aerology, Navy 
#161, o/o Fleet Post Office, San Francisco, Calif 
Bennan, Abraham J. M..^. (Brooklyn) Statistician. iJfBO College Avenue, New York, 
N. Y. 

Bigelow, Julian H. Asso. Director, Statistical Research Group, Columbia University. 
401 West 118th St, New York 27, N. Y. 

Bowen, Earl K. A.M (Boston) Instr Math. Northeastern Univ , Boston, Mass. On 
military leave—Scientific Consultant, Office of Field Service, O.S.R D. 6 Srbiep Ave., 
IF. Springfield, Mass. 

Canter, Stanley D, B.S. (Coll, City of N Y.) Statistician, Lerner Shops, Inc., New York, 
N. Y. S676 Morris Aue., The Bronx, S8, New York, N. Y. 

Cohen, Karl. Ph.D. (Columbia) Physicist, Standard Oil Development Co., Esso
Laboratories, Research Division, P. O. Box 243, Elizabeth B, N. J.

Cooper, William W. A B. (Chicago) Instr. in Economics, University of Chicago. 6539 S. 
Ellis Ave., Chicago 37, Ill. 

Davidson, James H. B.S. (Norwich Univ.) Research Physicist, Hercules Powder Co. 
Box SU, Christiansburg, Va. 

Epstein, Benjamin Ph.D, (Illinois) Staff Assistant, Westinghouse Electric & Mfg. Co,, 
Quality Control Dept., Rm. 3-A-17, East Pittsburgh, Pa. 


Gauthier, Prof. Abel A.M. (Columbia) Prof. of Mathematics, Université de Montréal,
2900 Mount Royal Blvd., Montreal, Canada.

Gersten, Lydia Blumenthal B A. (Hunter) Res. Stat. 1001 Lincoln Place, Brooklyn IS, 
N. Y. 

Goffman, Casper Ph.D. (Ohio State) Staff Asst., Quality Control Dept., Westinghouse
Elec. & Mfg. Co., Rm. 3-A-17, East Pittsburgh, Pa.

Hastay, Millard W. B.A. (Reed) Asso. Math., Stat. Res. Group, Columbia University.
401 West 118th St., New York 27, N. Y.

Houseman, Earl E. M.A. (South Dakota) Head Sampling Sec , Stat., Division of Program 
Surveys, Bur. of Agric. Econ , Washington 25, D. C. 

James, R. W. M.A. (Toronto) Asst, to Director, Washington Div., Wartime Prices & 
Trade Board. Room 3068, Railroad Retirement Bldg., Washington, D. C. 

Jones, Robert Richard, Jr. A.B. (Columbia) 81 Jackson St., New Rochelle, N. Y. 

Kac, Asst. Prof. Mark Ph.D. (John Casimir Univ., Lwow) Math. Dept., Whitehall,
Cornell University, Ithaca, N. Y.

Knoepfel, Margaret F. A.B. (Brooklyn) Jr. Stat, Weather Bureau, Washington, D C 
SS06 Ely Place, S E., Waehinglon 19, D. C, 

Ladd, Robert Boyd M.A. (Texas Coll, of Arts &. Industries) Stat. Consultant, OCT, 
Transport Economics, Traffic Control Div., War Dept., Washington, D. C. OOS Wade 
Ave., Rockville, Md. 

Larson, Charles M. B.Sc. (Nebraska) Stat. Analyst, Northrop Aircraft, Inc. S144 West 
ItBth Si., Hawthorne, Calif. 

Lesansky, William A. B B.A. (City Coll, of N. Y ) Stat, War Dept., Washington, D. C. 
1841 Summit Place, N.W., Washington 9, D, C. 

Lewis, Wyatt H. B.8. (Calif, Inst, of Tech,) Quality Control Engineer. tl9 East H 
Street, Ontario, Calif. 

Mathisen, Ensign Harold C. A.B. (Princeton) Ensign, USNR. 69 Fernwood Road, East
Orange, N. J.

Miller, Robert Carml Res. Engineer, Elgin National Watch Co., Elgin, Ill. 

Mittra, Probodh Chandra B.Sc. (India) Grad. Student in Math. Stat., Columbia
University, New York 27, N. Y.

Neumann, Prof. John von Ph.D. (Budapest) Institute for Advanced Study, Princeton,
N. J.

Noland, Asst. Prof. E. William Ph.D. (Cornell) Dept. of Sociology & Anthropology,
McGraw Hall, Cornell University, Ithaca, N. Y.

Okun, Yetta Edith B.A. (Hunter) Res Asst., Dept, of Labor, Washington, D C ISltO 
16ih St , N W , Washington 9, D C 

Owen, F. V. Ph.D. (Wisconsin) Geneticist, U. S Dept, of Agric. 1810 S. Main St., Salt 
Lake City, Utah 

Poston, Paul Lehman B S. (California) Statistician. George Washington Carver Hall, 
211 Elm St., Washington D. C. 

Ripe, William B. A.B (Davidson) Director, Dept of Stat. & Reports, Plomb Tool Co. 
908 Baldwin Ave,, El Monte, Calif. 

Rudnicki, Alex. B.S. (City Coll. of N. Y.) Grad. Student in Math. Stat. 107S Lorimir
St., Brooklyn H9, N. Y 

Rapp, William B. Mgr., Quality Control Dept., RCA Victor Div., Radio Corp. of America, 
Harrison, N. J. t9 Dodd St., East Orange, N. J. 

Savage, Leonard J. Ph.D. (Michigan) Res. Math., Stat. Res. Group, Columbia
University, 401 West 118th St., New York 27, N. Y.

Sheppard, David B.S. (Yale) Statistician, Army Air Forces. i7il Terrace Road, S.E., 
Washington BO, D. C. 

Smith, Prof. James Gerald PhD. (Princeton) Prof, of Economics, Princeton University. 
80 Murray Place, Princeton, N. J. 




Stigler, Prof. George J. Ph.D. (Chicago) Prof. of Economics, Member, Res. Staff,
National Bureau of Econ. Res., University of Minnesota, Minneapolis, Minn.

Weingarten, Harry M.A. (Columbia) Math. Teacher, School of Aviation Trades ]S30 
Morris Avc., Bronx B 6 , M. Y 

Weinstein, Joseph M.S. (C.C.N.Y.) Res. Analyst, Vacuum Tube Tests & Standardization,
Camp Evans Signal Lab., Signal Corps. IS Washington Village, Asbury Park, N. J.

Westman, A. E. R. Ph.D. (Toronto) Dir. of Chem. Res., Ontario Research Foundation,
43 Queen’s Park, Toronto S, Canada

Wilcox, Sidney W. L.B. (California) Chief Stat., Bur. of Labor Stat., Room 2318, Dept.
of Labor, Washington 25, D. C.

Young, Captain Chen-Pang B.A. (National Tsing Hua Univ., China) Ordnance Dept.,
Chinese Army. SSil Massachusetts Ave., N. W., Washington 8, D. C.

Corrections to the Directory Published in the December 1944 Issue

The name of Dr. Walter Schilling was omitted from the Directory. It should
have appeared as follows:

Schilling, Walter M.D. (Harvard) Asst. Clinical Professor of Medicine.
Stanford University Hospital, San Francisco 15, California

The name of Professor Godfrey H. Thomson, Director of the Training of
Teachers, University of Edinburgh, Edinburgh, Scotland, was misspelled.



CHOICE OF ONE AMONG SEVERAL STATISTICAL HYPOTHESES 


By Ralph J. Brookner¹

New York City 

1. Introduction. Statistical decision is a term which we will apply to that
phase of statistical inference which deals with the following question. Consider
one or several variates whose distribution function depends on one or
several unknown parameters; suppose there be given a finite number of mutually
exclusive hypotheses regarding the parameters, whose totality completely
exhausts every possibility. If a sample of observations on the variates is made,
the choice of one of the given hypotheses on the basis of that sample is called a
statistical decision. In other words, to make a statistical decision is to give a
procedure which will divide the sample space into as many regions as there are
given hypotheses, and to set up a one-to-one correspondence between these
regions and the hypotheses so that if the sample point lies in any particular
region, the corresponding hypothesis is chosen.

This notion is quite closely connected with both of the fields of statistical
inference that have engaged most of the modern statistical theorists. On the one
hand, it may be considered a generalization of the notion of testing hypotheses,
for in this theory, one gives a procedure which divides the sample space into a
region of rejection and a region of non-rejection of a given null hypothesis.
Then one makes either of two decisions depending upon which of the regions
contains the sample point. On the other hand, the theory of estimation is a
generalization of the notion of statistical decision in which the number of
alternatives is not restricted to be finite.

As in any phase of statistical inference, our primary aim is to define broad
principles upon which “good” or “best” procedures for making statistical
decisions may be based. The general problem of statistical decisions has been
formulated by A. Wald, who has also proposed a principle on which the solution can
be based. We are interested, however, in several of the simpler but important
particular problems in which quite serious calculation difficulties are encountered
in actually finding Wald’s solution. Hence, we will propose in its stead another
principle, which quite closely resembles Wald’s, for selecting a solution of the
problem of statistical decision.

It may be pointed out immediately that, from a purely logical point of view,
the substitute principle we shall offer will probably be considered to be less
acceptable than its predecessor. We will find, however, by considering its
application to some of the well known problems of testing hypotheses, that the
principle is at least reasonable in leading to certain well accepted results.

¹ Research under a grant-in-aid of the Carnegie Corporation of New York.



2. Principle determining the “best” procedure. We will first discuss briefly
Wald’s principle, and the definition of the criterion that we will employ will be
accomplished by pointing out the differences. A much more general formulation
is possible [1], [2], but we will discuss the principle as it will be directly
applied to the problems of statistical decisions when the number of hypotheses
is finite.

Consider the variates $x_1, x_2, \ldots, x_p$ whose probability density function
$f(x_1, \ldots, x_p \mid \theta_1, \theta_2, \ldots, \theta_k)$ is known except for
the unknown values of the parameters $\theta_1, \theta_2, \ldots, \theta_k$. We denote
by $\theta$ a point in $k$-dimensional space whose coordinates are
$(\theta_1, \theta_2, \ldots, \theta_k)$ and shall speak of this parameter
space as $\Omega$. Suppose that $\omega$ is any subset of $\Omega$ and that $S$ represents a system
of finitely many such sets which are mutually disjunct and which cover $\Omega$.
Each element, $\omega_0$, of $S$ corresponds to a hypothesis $H_{\omega_0}$, which is the hypothesis
that $\theta$ is a point of $\omega_0$, and the system of all such hypotheses corresponding to $S$
we denote by $H_S$.

A sample of $N$ observations on $x_1, x_2, \ldots, x_p$ is drawn and the sample may be
considered as a point, $E$, in the $pN$-dimensional sample space; denote the sample
space by $M$. We want to decide on the basis of the point $E$ which of the
hypotheses of $H_S$ should be accepted. That is, we seek a procedure by which the
sample space may be divided into a system of mutually exclusive regions $M_\omega$,
which are the same in number as the number of elements of $S$, and by which a
correspondence is set up so that the falling of the sample point into a particular
$M_{\omega_0}$ shall cause us to accept a particular hypothesis $H_{\omega_0}$ as the true one. If
the totality of regions $M_\omega$ be denoted $M_S$, it is necessary to give a principle by
which we may prefer a particular system $M_S$ over any other system $M_S'$.

Wald introduces the notion of a weight function of errors, a function of the
parameters and of the decision made, which might well be defined as the loss
incurred if $\theta$ be the true parameter point and the sample point falls in $M_\omega$, which
causes us to accept the hypothesis $H_\omega$. Denote the weight function by $W(\theta, \omega_E)$
where $\omega_E$ stands for that hypothesis which we choose if $E$ is the sample point;
then we require that $W(\theta, \omega_E)$ be non-negative, and if $\theta$ lies in $\omega_E$, $W(\theta, \omega_E) = 0$,
for then the correct decision has been made and there is no loss.

Perhaps the notion of a weight function can be most clearly understood, and
its importance appreciated, if we consider the place of statistics in the business
world, where possible losses are often computable in terms of money. The
weight function may be taken to be equal to this loss. Suppose a manufacturing
plant has a process which manufactures a product whose efficiency is a measurable
quantity that we will denote by $x$. Suppose $x$ is a random variable whose
distribution depends only upon its mean value $\theta$, and the company contemplates
renewing its machinery if the mean value of the efficiency falls short significantly
from a particular value $\theta_0$. Then on the basis of a sample of $N$ observations on
$x$, one of two decisions must be reached: the rejection of the hypothesis $\theta \ge \theta_0$
(the decision to renew the machinery), or the non-rejection of $\theta \ge \theta_0$ (the decision
not to renew it). Write $\omega_1$ for the set $\theta < \theta_0$ and $\omega_2$ for the set $\theta \ge \theta_0$.
Suppose the region $M_{\omega_1}$ is the region of the sample space such
that if $E$ falls into $M_{\omega_1}$, we reject $\theta \ge \theta_0$, and $M_{\omega_2}$ is the complementary region.
Then we may say that the weight function can be defined by

\[
\begin{aligned}
W(\theta, \omega_2) &= 0 && \text{for } \theta \ge \theta_0, \\
W(\theta, \omega_2) &= g(\theta) && \text{for } \theta < \theta_0, \\
W(\theta, \omega_1) &= 0 && \text{for } \theta < \theta_0, \\
W(\theta, \omega_1) &= h(\theta) && \text{for } \theta \ge \theta_0,
\end{aligned}
\]

where $h(\theta)$ is the company’s monetary loss in needlessly changing its machinery
and $g(\theta)$ is a function which expresses the company’s loss in not changing its
process even though the true value of the parameter is $\theta < \theta_0$. The function
$g(\theta)$ may be of almost any form, but it is only reasonable that it should be a
monotonic non-decreasing function of $|\theta_0 - \theta|$, since the loss should, it seems,
increase as the true value of $\theta$ is farther from $\theta_0$.

Wald then defines the risk as the expected value of the loss; since $\theta$ is an
unknown, the risk will be a function of $\theta$, and it will also be a function of the system
$M_S$:

\[ r(\theta, M_S) = \int_M W(\theta, \omega_E)\, f(E \mid \theta)\, dE. \]

According to Wald, the “best” system of regions, $M_S$, is that system for which
the maximum of the risk function with respect to the parameter $\theta$ is a minimum
with respect to all possible systems, $M_S$, of regions. Several important
properties are enjoyed by the system of regions defined in this way, though other
reasonable definitions are possible. Perhaps the criterion of minimizing an
average with respect to $\theta$ of $r(\theta, M_S)$ rather than the maximum may be
considered more plausible, but such definitions would raise the question of which
average should be used, and the result obtained by using any particular average
would not be invariant with respect to transformations of the parameter space.

Using the notations as introduced above, and introducing the notation $W(\theta, \omega_i)$
to be the weight function if the $i$th hypothesis is chosen, the principle which
we will use to solve some of the problems of statistical decisions can be given as
follows: In place of the risk function, we consider the $s$ functions

\[ R_i(\theta, E) = W(\theta, \omega_i)\, f(E \mid \theta) \qquad (i = 1, 2, \ldots, s) \]

where $f(E \mid \theta)$ is a notation for the probability density, and $s$ is the number of
given hypotheses. If we denote by $R_i(E)$ the least upper bound of $R_i(\theta, E)$
with respect to $\theta$, then we choose the system of “best” regions of acceptance by
including each sample point $E$ in a region $M_i$ determined such that for all $E_0$ in
$M_i$, $R_i(E_0) \le R_j(E_0)$ for all $j \ne i$.
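
The rule just stated can be sketched in a few lines of Python. Everything specific below is an assumption made for illustration and is not in the paper: independent normal observations with unit variance, three interval hypotheses about the mean, a loss of 1 for any wrong choice, and a finite grid standing in for the parameter space when the least upper bound is taken.

```python
import math

def joint_density(sample, theta):
    """f(E | theta) for independent N(theta, 1) observations (assumed model)."""
    return math.prod(math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)
                     for x in sample)

# Hypothetical hypotheses: membership tests for three disjoint sets covering the line.
HYPOTHESES = [
    lambda t: t < -1.0,
    lambda t: -1.0 <= t <= 1.0,
    lambda t: t > 1.0,
]

def weight(theta, i):
    """W(theta, omega_i): zero if theta lies in omega_i, one otherwise (assumed loss)."""
    return 0.0 if HYPOTHESES[i](theta) else 1.0

GRID = [i / 100 for i in range(-500, 501)]   # finite stand-in for the parameter space

def choose_hypothesis(sample):
    """Index i minimizing R_i(E), the grid maximum of W(theta, omega_i) f(E | theta)."""
    sups = [max(weight(t, i) * joint_density(sample, t) for t in GRID)
            for i in range(len(HYPOTHESES))]
    return min(range(len(HYPOTHESES)), key=lambda i: sups[i])

print(choose_hypothesis([2.1, 1.7, 2.4]))    # sample mean well above 1
print(choose_hypothesis([0.1, -0.2, 0.3]))   # sample mean near 0
```

With this constant loss, the chosen hypothesis is simply the one containing the grid point of greatest likelihood; other weight functions change the regions.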

It is interesting to note that a rather general case exists in which the principle
is exactly equivalent with the test of a hypothesis based upon the likelihood
ratio principle. Consider the distribution function
$f(x_1, x_2, \ldots, x_p \mid \theta_1, \theta_2, \ldots, \theta_k)$, which is a bounded function of the $x$’s and $\theta$’s. Suppose we are
interested in the test of the hypothesis $(\theta_1, \theta_2, \ldots, \theta_k) \in \omega$ where $\omega$ is a closed
set of points of the parameter space which does not contain any open subset of
the parameter space. Furthermore assume that for each set of $x$’s the
distribution function is continuous in $\theta_1, \ldots, \theta_k$ on an open subset of $\Omega$ containing $\omega$.

We will show that the principle will lead to the test based on the likelihood
ratio if the following is the weight function:

I. If $\omega$ is accepted, the loss is zero if the true parameter point is in $\omega$, and the
loss is a constant $c_1$ if the true parameter point is not in $\omega$.

II. If $\omega$ is rejected (i.e. $\bar\omega$ is chosen), the loss is zero if the true parameter
point is in $\bar\omega$ and is a constant $c_2$ if the true parameter point is in $\omega$.

Consider then the region of the sample space for which $\omega$ is rejected according
to the principle. This region is that for which

\[ \text{l.u.b. w.r.t. } \theta \text{ in } \omega \text{ of } [c_2 f(x \mid \theta)] < \text{l.u.b. w.r.t. } \theta \text{ in } \bar\omega \text{ of } [c_1 f(x \mid \theta)] \]

where we have set $f(x \mid \theta) = f(x_1, x_2, \ldots, x_p \mid \theta_1, \theta_2, \ldots, \theta_k)$, and where l.u.b.
w.r.t. means “least upper bound with respect to.” But the left-hand member
of this inequality is equal to

\[ c_2[\text{l.u.b. w.r.t. } \theta \text{ in } \omega \text{ of } f(x \mid \theta)] \]

and because of the restriction on $\omega$ and the continuity of $f$, we can see that the
l.u.b. of $f(x \mid \theta)$ with respect to all $\theta$ in $\bar\omega$ must coincide with the l.u.b. of the
function with respect to all $\theta$ in $\Omega$, which is the total parameter space. Thus
we have that the hypothesis $\omega$ is rejected when

\[ c_2[\text{l.u.b. w.r.t. } \theta \text{ in } \omega \text{ of } f(x \mid \theta)] < c_1[\text{l.u.b. w.r.t. } \theta \text{ in } \Omega \text{ of } f(x \mid \theta)] \]

or when

\[ \frac{\text{l.u.b. w.r.t. } \theta \text{ in } \omega \text{ of } f(x \mid \theta)}{\text{l.u.b. w.r.t. } \theta \text{ in } \Omega \text{ of } f(x \mid \theta)} < \frac{c_1}{c_2}. \]

The left hand member of this inequality is the likelihood ratio statistic
introduced by Neyman and Pearson [3]; hence our test is exactly equivalent with the
likelihood ratio test where the size of the critical region is determined by $c_1$
and $c_2$.
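
The equivalence can be checked on a small case. The sketch below rests on assumptions that are not in the paper: independent normal observations with unit variance, the single-point hypothesis that the mean is zero, a grid of parameter values for the least upper bounds, and arbitrary constants `c1 = 1`, `c2 = 20`. It evaluates the rejection rule of the principle and the likelihood ratio rule separately.

```python
import math

def joint_density(sample, theta):
    """f(x | theta) for independent N(theta, 1) observations (assumed model)."""
    return math.prod(math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2 * math.pi)
                     for x in sample)

GRID = [i / 1000 for i in range(-4000, 4001)]   # finite stand-in for Omega

def rejects_by_principle(sample, c1, c2):
    """Reject omega = {0} when c2 * sup over omega < c1 * sup over its complement."""
    sup_in_omega = joint_density(sample, 0.0)
    sup_outside = max(joint_density(sample, t) for t in GRID if t != 0.0)
    return c2 * sup_in_omega < c1 * sup_outside

def rejects_by_likelihood_ratio(sample, c1, c2):
    """Reject when (sup over omega) / (sup over Omega) < c1 / c2."""
    sup_in_omega = joint_density(sample, 0.0)
    sup_overall = max(joint_density(sample, t) for t in GRID)
    return sup_in_omega / sup_overall < c1 / c2

sample_a = [1.5, 2.0, 1.0, 1.9]    # mean 1.6, far from zero
sample_b = [0.2, -0.4, 0.5, 0.1]   # mean 0.1, close to zero
for s in (sample_a, sample_b):
    print(rejects_by_principle(s, 1.0, 20.0), rejects_by_likelihood_ratio(s, 1.0, 20.0))
```

The two rules agree here because removing the single point 0 from the grid changes the maximum only when the maximum falls at 0, in which case neither rule rejects.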

We pose the following quite hypothetical example to show circumstances
under which the principle proposed is reasonable. The principle does not
exactly apply, as it was stated in terms of probability densities and the example
involves discrete probabilities, but the logic seems somewhat applicable. Suppose
a game is played which consists of the player’s guessing the number of white
balls in an urn known to contain 10 balls, each of which is either white or black,
on the basis of a sample of four drawings with replacements from the urn. Let
us assume that there are eleven mutually exclusive hypotheses (as to the number
of white balls in the urn) to choose among, and the player must make a choice
of one of them after observing the drawing, which can give 16 different results.
Assume that the one who plays the game pays a banker a varying sum of money
if he makes a wrong decision and that the banker has the privilege of choosing
the population (i.e. the number of white and black balls originally in the urn).
Now on the basis of the assumption that the banker knows the player’s decision
function and will attempt to fix the population so as to make the player’s
expected loss a maximum, it is clear that Wald’s principle, which minimizes the
maximum loss, leads to the best way to play the game.

Now suppose that instead of one player making the choice among the
decisions, we have 16 players participating in the game and the first player is to
make the choice if, and only if, the drawing is WWWW, the second player if the
drawing is WWWB, and so on, where W stands for the drawing of a white ball
and B for the drawing of a black one. In this case, if player x assumes that the
banker will try to choose the population most unfavorable to him, then his
decision function based on the new principle is the best method of play.

Although the example indicates that in the usual case which would come up
in practice, Wald’s principle would lead to the better procedure, since the
statistician is usually faced with the necessity of giving a decision no matter
what the sample point is, the new principle is useful since one may hope that in
many practical cases the two principles will not lead to widely varying results,
especially if the sample is large.
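
The urn game lends itself to a small computation. The sketch below assumes, purely for illustration, that every wrong guess costs the same unit sum (the text speaks only of a varying sum), and the function names are hypothetical. For a drawing with a given number of white balls, the player applying the new principle picks the guess whose worst case, loss times probability of the observed drawing, is smallest over the eleven possible urns.

```python
def sequence_probability(whites_in_urn, whites_drawn, draws=4, balls=10):
    """Probability of one particular ordered drawing with the given number of whites."""
    p = whites_in_urn / balls
    return p ** whites_drawn * (1 - p) ** (draws - whites_drawn)

def unit_loss(true_whites, guess):
    """Assumed payment to the banker: 1 for any wrong guess, 0 for a right one."""
    return 0.0 if true_whites == guess else 1.0

def new_principle_guess(whites_drawn, loss=unit_loss):
    """Guess minimizing the maximum over urns of loss times probability of the drawing."""
    urns = range(11)

    def worst_case(guess):
        return max(loss(true, guess) * sequence_probability(true, whites_drawn)
                   for true in urns)

    return min(urns, key=worst_case)

guesses = [new_principle_guess(k) for k in range(5)]
print(guesses)   # -> [0, 3, 5, 7, 10]
```

Under this unit loss the guess is always the urn composition with the greatest likelihood for the observed drawing; for one white ball in four draws, 3 whites in the urn (probability .1029) narrowly beats 2 whites (.1024).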

3. Application of the criterion to the case of testing the mean of a normal
distribution. Now we will show that the criterion will lead to the widely used
test of “Student’s hypothesis.” Suppose $x$ is known to be distributed normally
with unknown mean $\mu$ and unknown variance $\sigma^2$. On the basis of a sample of $N$
independent observations $x_1, x_2, \ldots, x_N$, “Student’s $t$” is used to test the
hypothesis $\mu = 0$. If $\bar x$ is the arithmetic mean of the $N$ observations and $s^2$ the
usual sample estimate of the variance, then with $t = \sqrt{N}\,\bar x/s$, the hypothesis
is to be rejected if $|t| \ge t_0$ where $t_0$ is a critical value at some chosen level of
significance $\alpha$ obtained from the distribution of $t$ under the null hypothesis. We
will use the notation $\omega_1$ for the set of points $\mu \ne 0$ and $\omega_2$ for the set of points
$\mu = 0$.

We will consider the problem in reference to the particular weight function
defined as follows:

\[
\begin{aligned}
W(\mu, \sigma; \omega_2) &= (\mu/\sigma)^k && \text{for } \mu \ne 0, \\
W(0, \sigma; \omega_1) &= W, \\
W(\mu, \sigma; \omega_1) &= 0 && \text{for } \mu \ne 0, \\
W(0, \sigma; \omega_2) &= 0,
\end{aligned}
\]

where as a matter of convenience, we will take $k$ an even positive integer in order
to avoid the introduction of the absolute value of $\mu/\sigma$ which is necessary if $k$
is an odd integer. We also take $k \le N$.

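The density function of the sample of $N$ observations is

\[ f(E \mid \theta) = \frac{C}{\sigma^{N}} \exp\left\{-\frac{S(x_\alpha - \mu)^2}{2\sigma^2}\right\} \]

where $C$ is a constant. Then the two functions $R_i(\theta, E)$ are

\[
\begin{aligned}
R_1(\theta, E) &= \frac{WC}{\sigma^{N}}\, e^{-Sx_\alpha^2/2\sigma^2} && \text{if } \mu = 0, \\
R_1(\theta, E) &= 0 && \text{if } \mu \ne 0, \\
R_2(\theta, E) &= \frac{C\mu^k}{\sigma^{N+k}}\, e^{-S(x_\alpha - \mu)^2/2\sigma^2} && \text{if } \mu \ne 0, \\
R_2(\theta, E) &= 0 && \text{if } \mu = 0.
\end{aligned}
\]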

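To maximize $R_1(\theta, E)$, we set

\[ \frac{\partial R_1(\theta, E)}{\partial \sigma} = \left[-\frac{N}{\sigma} + \frac{Sx_\alpha^2}{\sigma^3}\right] \frac{WC}{\sigma^{N}}\, e^{-Sx_\alpha^2/2\sigma^2} = 0 \]

which gives

\[ \sigma^2 = \frac{Sx_\alpha^2}{N}; \]

hence

\[ R_1(E) = \frac{CW N^{N/2} e^{-N/2}}{(Sx_\alpha^2)^{N/2}}. \]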


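To maximize $R_2(\theta, E)$, we set

\[ \frac{\partial R_2(\theta, E)}{\partial \mu} = \left[\frac{k}{\mu} + \frac{S(x_\alpha - \mu)}{\sigma^2}\right] \frac{C\mu^k}{\sigma^{N+k}}\, e^{-S(x_\alpha - \mu)^2/2\sigma^2} = 0 \]

and

\[ \frac{\partial R_2(\theta, E)}{\partial \sigma} = \left[-\frac{N + k}{\sigma} + \frac{S(x_\alpha - \mu)^2}{\sigma^3}\right] \frac{C\mu^k}{\sigma^{N+k}}\, e^{-S(x_\alpha - \mu)^2/2\sigma^2} = 0 \]

which give the two relations

\[ \sigma^2 = \frac{S(x_\alpha - \mu)^2}{N + k} \]

and

\[ \sigma^2 = -\frac{\mu}{k}\, S(x_\alpha - \mu). \]

Then

\[ -\mu(N + k)\, S(x_\alpha - \mu) = k\, S(x_\alpha - \mu)^2 \]

or

\[ \mu^2 - \mu\bar x(1 - k/N) - (k/N^2)\, Sx_\alpha^2 = 0 \]

which gives the maximizing value of $\mu$,

\[ \mu^* = \frac{\bar x(1 - k/N) \pm \sqrt{\bar x^2(1 - k/N)^2 + (4k/N^2)\, Sx_\alpha^2}}{2}, \]

and it can easily be shown that the maximum is reached for the value of $\mu^*$
using the + sign when $\bar x$ is positive and the − sign when $\bar x$ is negative. We will
carry through the case $\bar x > 0$ only, as the case $\bar x < 0$ follows in a similar manner.
We have

\[ R_2(E) = C\mu^{*k} \left[\frac{k}{N\mu^*(\mu^* - \bar x)}\right]^{(N+k)/2} e^{-(N+k)/2}. \]
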
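To find the region of the sample space for which we should accept the
hypothesis $\mu \ne 0$ (i.e. the critical region for rejection of the hypothesis $\mu = 0$), we
seek those points $E$ for which $R_1(E) \le R_2(E)$, i.e., those for which

\[ \frac{CW N^{N/2} e^{-N/2}}{(Sx_\alpha^2)^{N/2}} \le C\mu^{*k} \left[\frac{k}{N\mu^*(\mu^* - \bar x)}\right]^{(N+k)/2} e^{-(N+k)/2} \]

or for which

\[ \frac{\mu^{*(N-k)/2}\,(\mu^* - \bar x)^{(N+k)/2}}{(Sx_\alpha^2)^{N/2}} \le c \]

where $c$ is a positive constant. Since both sides of the inequality are positive,
this inequality is equivalent to

\[ \frac{\mu^{*\,N-k}\,(\mu^* - \bar x)^{N+k}}{(Sx_\alpha^2)^{N}} \le c_1 \tag{1} \]

where $c_1$ is another positive constant.
Now we consider the statistic

\[ T^2 = \frac{t^2}{N - 1} = \frac{N\bar x^2}{Sx_\alpha^2 - N\bar x^2} = \frac{N}{Sx_\alpha^2/\bar x^2 - N} \]

from which we have

\[ Sx_\alpha^2/\bar x^2 = (N/T^2) + N. \]

Also note that

\[ 2(\mu^*/\bar x) = (1 - k/N) + \sqrt{(1 - k/N)^2 + (4k/N^2)(Sx_\alpha^2/\bar x^2)} \]

(and this is true whether $\bar x$ is positive or negative). Now we can write the
critical region (1) as

\[ \frac{(\mu^*/\bar x)^{N-k}\,(\mu^*/\bar x - 1)^{N+k}}{(Sx_\alpha^2/\bar x^2)^{N}} \le c_1 \]

or

\[ \frac{\left[1 - k/N + \sqrt{D}\right]^{N-k} \left[-1 - k/N + \sqrt{D}\right]^{N+k}}{\left[1 + 1/T^2\right]^{N}} \le c_2, \qquad D = (1 - k/N)^2 + (4k/N)(1 + 1/T^2), \]

where $c_2$ is another positive constant. We denote the left side of this inequality
by $\Phi(T^2)$, and it can be shown that $\Phi(T^2)$ is a monotone decreasing function of $T^2$.
Thus since the critical region is defined by the relation $\Phi(T^2) \le$ constant and
the critical region using “Student’s $t$” is $T^2 \ge$ constant, these procedures are
exactly equivalent.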
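
The monotonicity claim can be probed numerically. The sketch below is only a spot check, not a proof, and rests on two assumptions: it takes the left side of the final inequality to be the product of the two bracketed powers divided by the Nth power of 1 + 1/T², as reconstructed from the substitutions above, and it uses the illustrative values N = 10 and k = 2.

```python
import math

def phi(t2, n, k):
    """Assumed form of the left side of the final inequality, as a function of T^2."""
    root = math.sqrt((1 - k / n) ** 2 + (4 * k / n) * (1 + 1 / t2))
    first = (1 - k / n + root) ** (n - k)
    second = (-1 - k / n + root) ** (n + k)
    return first * second / (1 + 1 / t2) ** n

N_OBS, K = 10, 2                                   # illustrative values, k even and k <= N
t2_values = [0.01 * 1.5 ** j for j in range(24)]   # T^2 from 0.01 up to about 112
phi_values = [phi(t2, N_OBS, K) for t2 in t2_values]

is_decreasing = all(a > b for a, b in zip(phi_values, phi_values[1:]))
print(is_decreasing)
```

On this grid the values fall steadily as T² grows, so a bound of the form Φ(T²) ≤ constant picks out the large values of T², which is the familiar two-sided rejection region for t.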

4. A problem in statistical decisions. The question which aroused the interest
of the writer in statistical decisions is the following one of multivariate statistical
analysis. Suppose $x_1, x_2, \ldots, x_p$ are known to be normally distributed with
unknown means and unknown variances and covariances, and on the basis of
a set of $N$ independent observations, a test is to be made of the hypothesis
$E(x_1) = E(x_2) = \cdots = E(x_p) = 0$. Such a test may be carried out by using
the generalized Student Ratio [4], and the hypothesis is either to be rejected or
accepted as a whole. But consider the case in which the null hypothesis is
rejected; it seems quite natural to ask for a more enlightening statement. Is it
not possible to say that on the basis of the sample, the hypothesis should be
rejected for $x_{i_1}, x_{i_2}, \ldots, x_{i_r}$ but not rejected for $x_{i_{r+1}}, \ldots, x_{i_p}$? Thus
we seek a division of the sample space into $2^p$ mutually exclusive regions, each
of which will lead us to reject the hypothesis of zero expected values for a
particular set of the $x_i$’s and to accept it for the remaining set.

We will consider a solution of the problem in the case that the covariance
matrix of the joint normal distribution is known, and will motivate that solution
by considering first the case of two variables.

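Suppose that $X$ and $Y$ are normally and independently distributed with
unknown means, $\alpha$ and $\beta$, and with unit variances. The joint probability density
function is then of the form

\[ f(X, Y) = \frac{1}{2\pi} \exp\left\{-\tfrac{1}{2}\left[(X - \alpha)^2 + (Y - \beta)^2\right]\right\}. \]

The set of hypotheses is given as follows:

$H_1$ is the hypothesis that $\alpha = 0$ and $\beta = 0$

$H_2$ is the hypothesis that $\alpha \ne 0$ and $\beta = 0$

$H_3$ is the hypothesis that $\alpha = 0$ and $\beta \ne 0$

$H_4$ is the hypothesis that $\alpha \ne 0$ and $\beta \ne 0$.

We have a sample of $N$ independent pairs of observations $(X_\sigma, Y_\sigma)$ where $\sigma =
1, 2, \ldots, N$; then the density function in the $2N$-dimensional sample space is

\[ \frac{1}{(2\pi)^{N}} \exp\left\{-\tfrac{1}{2} \sum_{\sigma=1}^{N} \left[(X_\sigma - \alpha)^2 + (Y_\sigma - \beta)^2\right]\right\}. \]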

We seek the set of regions $M_1$, $M_2$, $M_3$, $M_4$ in the sample space which are
chosen such that if the sample point $E$ falls in $M_i$, we accept the hypothesis $H_i$.
We take the following as the values of the losses if the wrong decision is reached.

I. If $H_1$ is accepted,

i) for any parameter point $(\alpha, \beta)$, the loss is a continuous function of
$(\alpha^2 + \beta^2)$, say $W(\alpha^2 + \beta^2)$, which is zero for $\alpha = \beta = 0$, is differentiable,
strictly monotonically increasing, and possesses a finite maximum
when multiplied by the normal density function.

II. If $H_2$ is accepted,

i) for any parameter point $(\alpha, \beta)$ except $(0, 0)$, the loss is $W(\beta^2)$ where
$W$ is the same function as above,

ii) the loss is $W_1$ if the true parameter point is $(0, 0)$.

III. If $H_3$ is accepted,

i) for any parameter point $(\alpha, \beta)$ except $(0, 0)$, the loss is $W(\alpha^2)$ where
$W$ is the same function as above,

ii) the loss is $W_1$ if the true parameter point is $(0, 0)$.

IV. If $H_4$ is accepted,

i) the loss is $W_2$ if the true parameter point is either $(\alpha, 0)$ for $\alpha \ne 0$,
or $(0, \beta)$ for $\beta \ne 0$,

ii) the loss is $W_3$ if the true parameter point is $(0, 0)$,

where $W_1$, $W_2$, and $W_3$ are constants subject to some slight restrictions which
will be pointed out later.

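The functions $R_i(\theta, E)$ are then the following:

\[
\begin{aligned}
R_1(\theta, E) &= W(\alpha^2 + \beta^2)\, G(\alpha, \beta) && \text{for } \alpha^2 + \beta^2 \ne 0 \\
&= 0 && \text{for } \alpha = \beta = 0 \\
R_2(\theta, E) &= W(\beta^2)\, G(\alpha, \beta) && \text{for } \beta \ne 0 \\
&= W_1 G(0, 0) && \text{for } \alpha = \beta = 0 \\
&= 0 && \text{for } \alpha \ne 0,\ \beta = 0 \\
R_3(\theta, E) &= W(\alpha^2)\, G(\alpha, \beta) && \text{for } \alpha \ne 0 \\
&= W_1 G(0, 0) && \text{for } \alpha = \beta = 0 \\
&= 0 && \text{for } \alpha = 0,\ \beta \ne 0 \\
R_4(\theta, E) &= W_2 G(\alpha, 0) && \text{for } \alpha \ne 0,\ \beta = 0 \\
&= W_2 G(0, \beta) && \text{for } \alpha = 0,\ \beta \ne 0 \\
&= W_3 G(0, 0) && \text{for } \alpha = \beta = 0 \\
&= 0 && \text{for } \alpha\beta \ne 0
\end{aligned}
\]

where $G(\alpha, \beta)$ is the normal distribution function

\[ G(\alpha, \beta) = \frac{N}{2\pi} \exp\left\{-\frac{N}{2}\left[(\bar x - \alpha)^2 + (\bar y - \beta)^2\right]\right\}, \]

$\bar x$ and $\bar y$ being the sample means. It should be pointed out that the use of the
distribution of the sample means instead of the joint distribution of the
observations is justified since the sample means are sufficient statistics for the parameters
$\alpha$ and $\beta$.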

We will use the notation $R_2(E)$ to denote the maximum of $R_2(\theta, E)$ with respect
to $\alpha$ and $\beta$, and it can easily be seen to be the maximum of two expressions which
we will denote by II(1) and II(2), where II(1) is the maximum of $W(\beta^2)G(\alpha, \beta)$
and II(2) is the maximum of $W_1 G(0, 0)$. Similarly, $R_3(E)$ is the maximum of
III(1) and III(2), and $R_4(E)$ is the maximum of IV(1), IV(2), and IV(3), where
these are the maxima of the two expressions involved in $R_3(\theta, E)$, and the three
expressions in $R_4(\theta, E)$, respectively.
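
These four maxima can be tabulated for particular sample means. The sketch below is an illustration under several assumptions that are not in the paper: the loss function is taken as W(u) = u, the constants W_1, W_2, W_3 are all set to 1, N is 10, and the maxima over the parameters are approximated on a grid. It also uses the fact, argued in the text that follows, that the maximum of W(α² + β²)G(α, β) lies on the line through the origin and the point of sample means, so a search along one ray suffices.

```python
import math

N = 10                        # sample size (assumed for the illustration)
W1, W2, W3 = 1.0, 1.0, 1.0    # hypothetical constant losses

def W(u):
    """Assumed loss function: continuous, zero at 0, strictly increasing."""
    return u

def density_at(d2):
    """Normal density of the sample means at squared distance d2 from their true values."""
    return N / (2 * math.pi) * math.exp(-0.5 * N * d2)

def sup_along_ray(r):
    """Grid maximum over rho >= 0 of W(rho^2) times the density at distance |r - rho|."""
    return max(W((i / 1000) ** 2) * density_at((r - i / 1000) ** 2)
               for i in range(8001))

def risks(xbar, ybar):
    """[R1(E), R2(E), R3(E), R4(E)] for the point of sample means (xbar, ybar)."""
    at_origin = density_at(xbar ** 2 + ybar ** 2)           # G(0, 0)
    r1 = sup_along_ray(math.hypot(xbar, ybar))
    r2 = max(sup_along_ray(abs(ybar)), W1 * at_origin)      # max of II(1), II(2)
    r3 = max(sup_along_ray(abs(xbar)), W1 * at_origin)      # max of III(1), III(2)
    r4 = max(W2 * density_at(ybar ** 2), W2 * density_at(xbar ** 2), W3 * at_origin)
    return [r1, r2, r3, r4]

def decide(xbar, ybar):
    """Number i of the accepted hypothesis H_i: the one with the smallest R_i(E)."""
    values = risks(xbar, ybar)
    return 1 + values.index(min(values))

for point in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]:
    print(point, decide(*point))
```

With these choices the origin falls in the region for H_1, points far out along one axis fall in the regions for H_2 or H_3, and a point far from both axes falls in the region for H_4.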

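We will first show that the function $R_1(E)$ is a monotonic increasing function
of $(\bar x^2 + \bar y^2)$. We know that the maximum of $R_1(\theta, E)$ is reached for values of
$\alpha$ and $\beta$ for which the partial derivatives of $R_1(\theta, E)$ with respect to $\alpha$ and $\beta$
are zero, i.e., for which

\[ [N(\bar x - \alpha)W(\alpha^2 + \beta^2) + 2\alpha W'(\alpha^2 + \beta^2)]\, G(\alpha, \beta) = 0 \]

and

\[ [N(\bar y - \beta)W(\alpha^2 + \beta^2) + 2\beta W'(\alpha^2 + \beta^2)]\, G(\alpha, \beta) = 0 \]

where $W'(\alpha^2 + \beta^2)$ is the derivative of $W(\alpha^2 + \beta^2)$ with respect to $(\alpha^2 + \beta^2)$.

Since $G(\alpha, \beta) \ne 0$, and $W'(\alpha^2 + \beta^2) \ne 0$, these relations imply

\[ \frac{\bar x - \alpha}{\alpha} = \frac{\bar y - \beta}{\beta} \]

or $\beta\bar x = \alpha\bar y$. Thus the maximum of the function $R_1(\theta, E)$ occurs for values of
$\alpha$ and $\beta$ which satisfy the relation $\alpha = (\bar x/\bar y)\beta$.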

Consider any two straight lines $\alpha = (\bar x'/\bar y')\beta$ and $\alpha = (\bar x''/\bar y'')\beta$, and the
values of the function $R_1(\theta, E)$ along these two lines. Obviously the values of
the first factor $W(\alpha^2 + \beta^2)$ are equal for points along the lines equidistant from
the origin. Also, if the values of $\bar x'$, $\bar y'$, $\bar x''$, and $\bar y''$ are such that
$\bar x'^2 + \bar y'^2 = \bar x''^2 + \bar y''^2$, the values of the function $G(\alpha, \beta)$ along both lines are equal for points
equidistant from the origin, and it follows that $R_1(\bar x', \bar y') = R_1(\bar x'', \bar y'')$. Thus
we have that $R_1(E)$ is a function of $(\bar x^2 + \bar y^2)$.

Note that if the value of $\bar x''^2 + \bar y''^2$ is greater than the value of $\bar x'^2 + \bar y'^2$, the
curve representing the function $G(\alpha, \beta)$ along $\alpha = (\bar x''/\bar y'')\beta$ is the same as that
along the line $\alpha = (\bar x'/\bar y')\beta$, but it is shifted further from the origin. The values
of $W(\alpha^2 + \beta^2)$ are independent of $\bar x$ and $\bar y$, and the function is monotonic in
$\alpha^2 + \beta^2$. Thus, the value of $G(\alpha, \beta)$ for which $R_1(\theta, E)$ is a maximum on
$\alpha = (\bar x'/\bar y')\beta$ multiplies a larger value of $W(\alpha^2 + \beta^2)$ on $\alpha = (\bar x''/\bar y'')\beta$, so the
maximum when $\bar x''^2 + \bar y''^2$ exceeds $\bar x'^2 + \bar y'^2$ is the greater. But this proves that
$R_1(E)$ is monotonically increasing in $(\bar x^2 + \bar y^2)$.

In a similar manner, we now proceed to show that II(1) is a monotonically
increasing function of $\bar y^2$. We know that a necessary condition for a maximum
of II(1) is that

\[ \frac{\partial\, \mathrm{II}(1)}{\partial \alpha} = \frac{\partial\, \mathrm{II}(1)}{\partial \beta} = 0. \]

The first of these two relations is

\[ W(\beta^2)\, N(\bar x - \alpha)\, G(\alpha, \beta) = 0 \]

which has the solutions $W(\beta^2) = 0$ and $\alpha = \bar x$. But $W(\beta^2) = 0$ only for $\beta = 0$
and this value is a minimum of II(1), hence we have that the maximum is reached
for $\alpha = \bar x$, so

\[ \mathrm{II}(1) = \text{max. of } W(\beta^2)\, \frac{N}{2\pi} \exp\left\{-\frac{N}{2}(\bar y - \beta)^2\right\}. \]

But along any two lines $\alpha$ = constant in the $(\alpha, \beta)$-plane, the function $W(\beta^2)$
has identical monotonically increasing values in $\beta^2$, and the normal density
function is identical along two such lines for a fixed value of $\bar y^2$. An increase in
the value of $\bar y^2$ displaces the normal function from the origin but does not affect
its shape, hence the value of the normal density function at which II(1) takes on
its maximum is multiplied by a greater value of $W(\beta^2)$ when $\bar y^2$ is increased, so
II(1) is monotonically increasing in $\bar y^2$. In exactly the same manner, we find
that III(1) is a monotonically increasing function of $\bar x^2$.

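Because the remaining functions are identical with the functions considered
in the special case above, we have that

\[
\begin{aligned}
\mathrm{II}(2) &= W_1 \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)} \\
\mathrm{III}(2) &= W_1 \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)} \\
\mathrm{IV}(1) &= W_2 \frac{N}{2\pi}\, e^{-(N/2)\bar y^2} \\
\mathrm{IV}(2) &= W_2 \frac{N}{2\pi}\, e^{-(N/2)\bar x^2} \\
\mathrm{IV}(3) &= W_3 \frac{N}{2\pi}\, e^{-(N/2)(\bar x^2 + \bar y^2)}
\end{aligned}
\]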

Now it is apparent that $R_1(E)$ is never less than II(1) since

\[ W(\alpha^2 + \beta^2)\, G(\alpha, \beta) \ge W(\beta^2)\, G(\alpha, \beta) \]

(the equality holds only for $\alpha = 0$) and since a function which is never less than
a second function cannot have a maximum less than the maximum of the second
function. Also $R_1(E)$ for the same reason is never less than III(1). Thus $R_1(E)$
can be the minimum of the four functions $R_i(E)$ at most when $R_2(E)$ is defined
by II(2) and $R_3(E)$ is defined by III(2).

Since II(2) and III(2) are the same monotonic decreasing function of
$(\bar x^2 + \bar y^2)$ and since $R_1(E)$ is a monotonic increasing function of $(\bar x^2 + \bar y^2)$, there is a
value $r_0^2$ of $(\bar x^2 + \bar y^2)$ such that $R_1(E) <$ II(2) when and only when $\bar x^2 + \bar y^2 < r_0^2$.
But for all values $(\bar x, \bar y)$ we have that $R_1(E) \ge$ II(1) and $R_1(E) \ge$ III(1), hence
for all values within the circle $\bar x^2 + \bar y^2 = r_0^2$ we have that

\[ \mathrm{II}(1) \le R_1(E) < \mathrm{II}(2) \tag{2} \]

and

\[ \mathrm{III}(1) \le R_1(E) < \mathrm{III}(2) \tag{3} \]

so it follows that $R_2(E)$ is defined by II(2) and $R_3(E)$ is defined by III(2) within
the circle.

We restrict the values of $W_1$, $W_2$, and $W_3$ used in the definitions of the weight
functions to be Tl'^j ^ Il'a ^ > hence for all values of {x, y) 

and 

so $R_4(E)$ is at least as great as II(2) over the whole plane; hence, in light of
relation (2), $R_4(E)$ is at least as great as $R_1(E)$ for $\bar x^2 + \bar y^2 \le r_0^2$. Therefore,
since (2) shows that $R_1(E) < R_2(E)$ within the circle; (3) shows that $R_1(E) <
R_3(E)$ within the circle; and since quite obviously the relations do not hold
outside the circle, we have that $M_1$ is the set of points

\[ \bar x^2 + \bar y^2 < r_0^2. \]

To determine the region $M_2$, we must determine those points outside $M_1$ for which $R_2(E) < R_3(E)$ and $R_2(E) < R_4(E)$. Consider first the part of the plane outside $M_1$ for which $R_2(E)$ is defined by II(2). This is the region for which II(2) > II(1). Consider the curve in the plane defined by II(2) = II(1), that is,

UhCfi- = 11(1). 

We take differentials and have 

+ ydy] = 2ymiil)/d{y^)]dy 

but this shows that $dy/dx$ has the opposite sign from $y/x$ since $d\,\mathrm{II}(1)/d(y^2)$ is always positive. Also note that for $x = 0$, the equation $R_1(E) =$ II(2) is identical with the equation II(1) = II(2), so for $x = 0$, we have II(1) > II(2) when $|y| > r_0$ and II(1) < II(2) when $|y| < r_0$. Furthermore, the curve II(1) = II(2) crosses the $x$ axis at a finite value of $x$, since for $y = 0$, II(1) is a constant while II(2) is a decreasing function of $x$.

We will refer to the various regions in the first quadrant of the $(x, y)$-plane shown in Figure 1 as follows: A is the part of the quadrant which is $M_1$; A, B, B', and C are the regions in which $R_2(E)$ is defined by II(2), that is, in which II(2) > II(1); and in the same manner, A, B, B', and C' are the regions in which $R_3(E)$ is defined by III(2).

Since II(2) and III(2) are identical, we see that, within the regions B and B', $R_2(E) = R_3(E)$ since in these regions $R_2(E)$ is defined by II(2) and $R_3(E)$ is defined by III(2). We have previously pointed out that II(2) is never greater than $R_4(E)$, hence it is clear that B and B' should belong to either $M_2$ or $M_3$, and we will arbitrarily decide that B is part of $M_2$ and B' part of $M_3$.

Consider then the region C where $R_2(E)$ is defined by II(2) and $R_3(E)$ by III(1), so within C

II(2) = III(2) < III(1) = $R_3(E)$

and again II(2) $\le R_4(E)$, so the region C is part of $M_2$. By the same argument, we have that C' is a part of $M_3$ since, within C'

III(2) = II(2) < II(1) = $R_2(E)$

and III(2) $\le R_4(E)$.

Now consider the remainder of the quadrant outside A, B, B', C, and C'. Here $R_2(E)$ is defined by II(1) and $R_3(E)$ is defined by III(1). Since II(1) is the same monotone increasing function of $y^2$ as III(1) is of $x^2$, we have II(1) > III(1) for $|y| > |x|$ and II(1) < III(1) for $|x| > |y|$. Thus we see that in the region under discussion, $R_2(E)$ is a minimum at most in the regions D and E and $R_3(E)$ a minimum at most in D' and E'.






In order to determine, then, that part of D and E which belongs to $M_2$, we seek the region for which

II(1) < IV(1) when $R_4(E)$ is defined by IV(1)

II(1) < IV(2) when $R_4(E)$ is defined by IV(2)

II(1) < IV(3) when $R_4(E)$ is defined by IV(3).

But within D and E we have that $|y| < |x|$, so it follows that IV(1) > IV(2), so $R_4(E)$ is never defined by IV(2) in D or E. Hence we need determine only the points which satisfy the first and third of these relations. Now it is clear that the relation II(1) < IV(1) is equivalent to the relation $|y| < y_0$ for some value $y_0$ since II(1) is monotonically increasing in $y^2$ and IV(1) is monotonically decreasing in $y^2$. Let $y = y_0$ be the line dividing D and E.

We impose a restriction on $W_3$ such that D is part of $M_2$ and E is part of $M_4$. This restriction is that within E, IV(3) $\ge$ IV(1); note that since we are concerned only with $|y| < |x|$, this imposes the greatest restriction on $W_3$ when $x = y = y_0$, so we are requiring that


01 

It is simple to see that because of symmetry with respect to both axes and the origin, $M_2$ is defined by $x^2 + y^2 > r_0^2$ and $|y| < |x|$ and $|y| < y_0$; $M_3$ by $x^2 + y^2 > r_0^2$ and $|x| < |y|$ and $|x| < x_0$; and $M_4$ by $x^2 + y^2 > r_0^2$ and $|y| > y_0$ and $|x| > x_0$. It should be pointed out that $x_0 = y_0$.
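The four regions just described can be sketched as a small classifier. This is only an illustration: the function name and the numerical values of $r_0$ and $y_0$ are hypothetical (the paper determines them from the weight function), and the treatment of boundary points, which the inequalities above leave unassigned, is an arbitrary assumption.

```python
def decision_region(x, y, r0, y0):
    """Return k in {1, 2, 3, 4} such that the sample point (x, y) lies in M_k.

    r0 and y0 (= x0) are taken as given; their values here are hypothetical.
    Boundary points are assigned arbitrarily (an assumption of this sketch).
    """
    if x * x + y * y < r0 * r0:
        return 1                      # inside the circle: M_1
    ax, ay = abs(x), abs(y)
    if min(ax, ay) > y0:
        return 4                      # |x| > x0 and |y| > y0: M_4
    return 2 if ay < ax else 3        # |y| < |x| gives M_2, otherwise M_3
```

For example, with `r0 = 1` and `y0 = 2`, the point `(3, 0.5)` falls in $M_2$ and the point `(3, 3)` in $M_4$.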

We now consider the general case with a known covariance matrix. Consider the joint normally distributed variates $X_1, X_2, \cdots, X_p$ whose covariance matrix is $\|\sigma_{ij}\|$ $(i, j = 1, 2, \cdots, p)$, where the $\sigma_{ij}$'s are all known and where $\|\sigma_{ij}\|$ is positive definite. The mean values of the $X_i$'s are $\beta_1, \beta_2, \cdots, \beta_p$, which are unknown. It is simple to see that we can consider new variates $x_i = X_i/\sqrt{\sigma_{ii}}$ whose mean values are $\alpha_i = \beta_i/\sqrt{\sigma_{ii}}$ and whose covariance matrix is $\|\sigma_{ij}\|$ where $\sigma_{ii} = 1$. If a sample of $N$ independent observations on the $X_i$'s is given, we have immediately the observations on the $x_i$'s, and we denote the sample means of the $x_i$'s by $\bar{x}_1, \bar{x}_2, \cdots, \bar{x}_p$, respectively.

There are $2^p$ hypotheses among which we wish to choose; as notation, we let

$H_0$ be $\alpha_1 = \alpha_2 = \cdots = \alpha_p = 0$

$H_1$ be $\alpha_1 \ne 0$, $\alpha_2 = \alpha_3 = \cdots = \alpha_p = 0$

$H_2$ be $\alpha_2 \ne 0$, $\alpha_1 = \alpha_3 = \cdots = \alpha_p = 0$

$H_{12}$ be $\alpha_1\alpha_2 \ne 0$, $\alpha_3 = \alpha_4 = \cdots = \alpha_p = 0$

etc. As a further abbreviation, let $H^1$ denote any one of the $p$ hypotheses $H_1, H_2, \cdots, H_p$; let $H^2$ denote any of the $\binom{p}{2}$ hypotheses $H_{12}, H_{13}, \cdots$; $H^3$ denote any of the $\binom{p}{3}$ hypotheses $H_{123}, H_{124}, \cdots$; etc. Also let $M_{i_1 i_2 \cdots i_k}$ be the region of the sample space for which we accept the hypothesis $H_{i_1 i_2 \cdots i_k}$, and let $R_{i_1 i_2 \cdots i_k}(\theta, E) = W(\theta, H_{i_1 i_2 \cdots i_k})\,f(E \mid \theta)$ be the risk density function if the hypothesis $H_{i_1 i_2 \cdots i_k}$ is chosen, where we have used the notation $\theta$ to represent the parameter point $(\alpha_1, \alpha_2, \cdots, \alpha_p)$.
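The count of $2^p$ hypotheses can be made concrete with a short enumeration. The sketch below is illustrative only; representing $H_{i_1 \cdots i_k}$ by the tuple of indices whose means are taken to be non-zero is an assumption of this example, not notation from the paper.

```python
from itertools import combinations

def enumerate_hypotheses(p):
    """List every hypothesis as the tuple of indices i with alpha_i != 0.

    The empty tuple stands for H_0; (1, 2) stands for H_12, and so on.
    """
    return [subset
            for k in range(p + 1)
            for subset in combinations(range(1, p + 1), k)]
```

For $p = 3$ this yields eight tuples, of which three have length two, matching the $\binom{3}{2}$ hypotheses of type $H^2$.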

We will also adopt the following notations: in referring to the parameter point $(\alpha_1, \alpha_2, \cdots, \alpha_p)$, we will write $(i_1, i_2, \cdots, i_k) = 0$ to mean all points for which $\alpha_{i_1} = \alpha_{i_2} = \cdots = \alpha_{i_k} = 0$ and $(\alpha_{j_1})(\alpha_{j_2}) \cdots (\alpha_{j_l}) \ne 0$, where $i_1, i_2, \cdots, i_k, j_1, j_2, \cdots, j_l$ are a permutation of the integers $1, 2, \cdots, p$. Furthermore, we will write $[j_1, j_2, \cdots, j_l] \ne 0$ to mean $(i_1, i_2, \cdots, i_k) = 0$.

By $Q$ we denote the covariance matrix of the $x_i$'s and by $L$ its inverse; we will denote the elements of $L$ by $\lambda_{ij}$. By $Q^{i_1 i_2 \cdots i_k}$ we denote the matrix obtained by striking out rows $i_1, i_2, \cdots, i_k$ and columns $i_1, i_2, \cdots, i_k$ from $Q$; by $L^{i_1 i_2 \cdots i_k}$ we denote the inverse of the matrix $Q^{i_1 i_2 \cdots i_k}$, and we will write the elements of $L^{i_1 i_2 \cdots i_k}$ as $\lambda_{ij}^{i_1 i_2 \cdots i_k}$. Thus we can write the joint distribution of the set of sample means $\bar{x}_1, \bar{x}_2, \cdots, \bar{x}_p$ as

(4) $C e^{-\frac{N}{2}\sum\sum \lambda_{ij}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j)}$


Concerning the definition of the weight function, we will assume the following:

I. If $H_0$ is accepted,

i) the loss is $W(\sum\sum \lambda_{ij}\alpha_i\alpha_j)$ if the true parameter point is $(\alpha_1, \alpha_2, \cdots, \alpha_p)$, where $W$ is a continuous, strictly monotonic increasing function whose value is zero if $(1, 2, \cdots, p) = 0$. The function is restricted to increase slowly enough that the product of it and the density function (4) has a finite maximum with respect to the $\alpha_i$'s.

II. If $H^1$ is accepted,

i) consider in particular $H_a$, then for all parameter points except $(1, 2, \cdots, p) = 0$, the value of the loss is $W(\sum\sum \lambda_{ij}^{a}\alpha_i\alpha_j)$, where $W$ is the function defined above.

ii) the loss is W* if the true parameter point is (1, 2, • • , p) = 0. 

III. If is accepted, 

i) consider in particular Hab, then for all parametei points except 

( 1 , 2 , • ■ ■ , p) == 0 and [a] 0 and [5] 7 ^ 0 , the loss k fTfSSXtJo'jay), 

where W is the function defined above, 

ii) the loss ia Wl if the true parameter point is either [a] 7 ^ 0 or [b] 7 ^ 0 , 
where Wl ^ Wl, 

iii) the loss is if the true parameter point is ( 1 , 2 , ■ ■ • , p) = 0 wheie 
Wl g Wl. 

In general; if II’’ is accepted, 

i) consider in particular ...^, then for all paiameter points except. 

( 1 , 2 , ■ • • , p) = 0 , [ill 7 ^ 0 , M 5 ^ 0 , • • • , [fi, 12 ] 7 ^ 0 , \ii , la] 7 ^ 0 , ■ • • , 

etc., the loss is W^( 22 X]}*’" **ayci£y), 

ii) the loss is (r = 1, 2, • • • , fc — 1) if [ly, , i,i, ■ ■ ■ , iyj 0, where 

, • • • , 3r are r different positive integers less than or equal to k. 

Also Fti ^WU^ ■■■ -^wl, wlzX ^ WU , WU g Wt.\ ^ WU , 

etc. 

iii) the loss is W\ if ( 1 , 2 , • • ■ , p) = 0 , where W\ ^ W\ , 

where the Wj are constants subject to some further slight restrictions which wo 
will impose later. The 22 has been used throughout to denote summation ovei 
all values which i and j take on in 

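We consider first the risk density function corresponding to $H_0$, that is

$$R_0(\theta, E) = W\Big(\sum\sum \lambda_{ij}\alpha_i\alpha_j\Big)\, C e^{-\frac{N}{2}\sum\sum \lambda_{ij}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j)}.$$

To maximize $R_0(\theta, E)$, we have the set of $p$ equations obtained by setting the $p$ partials of $R_0(\theta, E)$ with respect to the $\alpha_i$ equal to zero, which are necessary conditions. We have

$$\frac{\partial R_0(\theta, E)}{\partial \alpha_i} = \Big[\frac{\partial W}{\partial \alpha_i} + N W \sum_j \lambda_{ij}(\bar{x}_j - \alpha_j)\Big]\, C e^{-\frac{N}{2}\sum\sum \lambda_{ij}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j)}$$

so the necessary conditions are

$$\frac{\partial W}{\partial \alpha_i} + \Big[N \sum_j \lambda_{ij}(\bar{x}_j - \alpha_j)\Big] W = 0 \qquad (i = 1, 2, \cdots, p).$$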

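This can also be written

$$\Big(2\sum_j \lambda_{ij}\alpha_j\Big) D_z W(z) + W(z)\, N \sum_j \lambda_{ij}(\bar{x}_j - \alpha_j) = 0$$

where we have set $z = \sum\sum \lambda_{ij}\alpha_i\alpha_j$ and where we use the notation $D_z$ to indicate differentiation with respect to $z$. Fix $i$ at two particular values, say $a$ and $b$; then two of the equations of this system can be written

$$\Big(2\sum_j \lambda_{aj}\alpha_j\Big) D_z W(z) + W(z)\, N \sum_j \lambda_{aj}(\bar{x}_j - \alpha_j) = 0$$

$$\Big(2\sum_j \lambda_{bj}\alpha_j\Big) D_z W(z) + W(z)\, N \sum_j \lambda_{bj}(\bar{x}_j - \alpha_j) = 0$$

that is

$$\Big(\sum_j \lambda_{aj}\alpha_j\Big)\Big[\sum_j \lambda_{bj}(\bar{x}_j - \alpha_j)\Big] = \Big(\sum_j \lambda_{bj}\alpha_j\Big)\Big[\sum_j \lambda_{aj}(\bar{x}_j - \alpha_j)\Big]$$

or

$$\Big(\sum_j \lambda_{aj}\alpha_j\Big)\Big(\sum_j \lambda_{bj}\bar{x}_j\Big) = \Big(\sum_j \lambda_{bj}\alpha_j\Big)\Big(\sum_j \lambda_{aj}\bar{x}_j\Big).$$

This we can write as

$$\sum_j\sum_k \lambda_{aj}\lambda_{bk}\,\alpha_j\bar{x}_k = \sum_j\sum_k \lambda_{aj}\lambda_{bk}\,\alpha_k\bar{x}_j,$$

or

$$\sum_j\sum_k \lambda_{aj}\lambda_{bk}(\alpha_j\bar{x}_k - \alpha_k\bar{x}_j) = 0.$$

Giving $a$ and $b$ the combinations of values which are possible, this is a set of linear homogeneous equations in the $p^2$ unknowns $(\alpha_j\bar{x}_k - \alpha_k\bar{x}_j)$ which has the obvious solution $\alpha_j\bar{x}_k - \alpha_k\bar{x}_j = 0$ or $\alpha_j\bar{x}_k = \alpha_k\bar{x}_j$.

Thus we have that the maximum of the function $R_0(\theta, E)$ is reached for a set of values of the $\alpha_i$'s which lie on the straight line

(5) $\alpha_i = (\bar{x}_i/\bar{x}_1)\,\alpha_1$.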
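As a numerical illustration of relation (5), the sketch below takes the particular weight function $W(z) = z$ (an assumption made only for this example; the paper leaves $W$ general), for which the maximizing point on the line $\alpha = t\bar{x}$ satisfies $Nqt^2 - Nqt - 2 = 0$ with $q = \sum\sum\lambda_{ij}\bar{x}_i\bar{x}_j$. The function names and the test values are hypothetical.

```python
import math

def quad(L, u, v):
    """Bilinear form sum_i sum_j L[i][j] * u[i] * v[j]."""
    return sum(L[i][j] * u[i] * v[j]
               for i in range(len(u)) for j in range(len(v)))

def maximizer(L, xbar, N):
    """Maximizing alpha for the assumed W(z) = z; it lies on alpha = t * xbar."""
    q = quad(L, xbar, xbar)
    # Positive root of N*q*t^2 - N*q*t - 2 = 0 (stationarity along the line).
    t = (N * q + math.sqrt((N * q) ** 2 + 8 * N * q)) / (2 * N * q)
    return [t * x for x in xbar]

def log_risk(L, xbar, N, alpha):
    """log of W(z) * exp(-(N/2) * Q) with W(z) = z; the constant C is dropped."""
    d = [x - a for x, a in zip(xbar, alpha)]
    return math.log(quad(L, alpha, alpha)) - 0.5 * N * quad(L, d, d)
```

Comparing `log_risk` at the returned point with its value at arbitrary other parameter points gives a quick check that no point off the line does better.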

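The function $R_0(E)$, which is the maximum of $R_0(\theta, E)$ with respect to the $\alpha_i$'s, is a monotonically increasing function of $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j$, which we show in the following manner. Because of (5), we see that

$$\sum\sum \lambda_{ij}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j) = \sum\sum \lambda_{ij}[\bar{x}_i - (\bar{x}_i/\bar{x}_1)\alpha_1][\bar{x}_j - (\bar{x}_j/\bar{x}_1)\alpha_1] = \sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j\,[1 - (\alpha_1/\bar{x}_1)]^2.$$

Also,

$$\sum\sum \lambda_{ij}\alpha_i\alpha_j = \sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j\,(\alpha_1/\bar{x}_1)^2.$$

Hence we see that $R_0(E)$ is the maximum with respect to $\alpha_1$ of

$$W\Big(\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j\,(\alpha_1/\bar{x}_1)^2\Big)\, C e^{-\frac{N}{2}\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j\,[1 - (\alpha_1/\bar{x}_1)]^2}$$

so for two sample points $E' = (\bar{x}_1', \bar{x}_2', \cdots, \bar{x}_p')$ and $E'' = (\bar{x}_1'', \bar{x}_2'', \cdots, \bar{x}_p'')$ such that $\sum\sum \lambda_{ij}\bar{x}_i'\bar{x}_j' = \sum\sum \lambda_{ij}\bar{x}_i''\bar{x}_j''$, it is clear that $R_0(E') = R_0(E'')$; thus $R_0(E)$ is a function of $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j$.

But then without loss of generality, we can consider $R_0(E)$ along the $\bar{x}_1$ axis, i.e. for $\bar{x}_2 = \bar{x}_3 = \cdots = \bar{x}_p = 0$. Using relation (5), we see that this implies that the maximizing parameter values are $\alpha_2 = \alpha_3 = \cdots = \alpha_p = 0$. But then

$$R_0(E) = \max_{\alpha_1}\; W(\lambda_{11}\alpha_1^2)\, C e^{-\frac{N}{2}\lambda_{11}(\bar{x}_1 - \alpha_1)^2}$$

which we have previously shown is a monotonic increasing function of $\bar{x}_1^2$. Therefore $R_0(E)$ is a monotonic increasing function of $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j$.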

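We will furthermore show that the maximum of each risk density function corresponding to parts i) as given in the weight functions is a monotonically increasing function of a certain quadratic form in the $\bar{x}_i$. Consider for example the function corresponding to part i) of $R_1(\theta, E)$, that is

(6) $W\Big(\sum\sum \lambda_{ij}^{1}\alpha_i\alpha_j\Big)\, C e^{-\frac{N}{2}\sum\sum \lambda_{ij}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j)}$

We will write the maximum of this function with respect to the $\alpha_i$'s as $R_1(\mathrm{i})$. Note that the weight function is not a function of $\alpha_1$, hence the partial derivative of (6) with respect to $\alpha_1$ set equal to zero is equivalent to

$$\sum_j \lambda_{1j}(\bar{x}_j - \alpha_j) = 0.$$

Squaring this relation and multiplying by $N/2\lambda_{11}$ gives

$$(N/2\lambda_{11})\sum\sum \lambda_{1i}\lambda_{1j}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j) = 0$$

so we can write the exponent in (6)

$$\text{Exp.} = -(N/2\lambda_{11})\sum\sum (\lambda_{11}\lambda_{ij} - \lambda_{1i}\lambda_{1j})(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j).$$

Because of the definition of $\lambda_{ij}$, if we write $\omega_{ij}$ for the cofactor of $\sigma_{ij}$ in $|\sigma_{ij}|$, we have

$$\text{Exp.} = -[N/(2\lambda_{11}|\sigma_{ij}|^2)]\sum\sum (\omega_{11}\omega_{ij} - \omega_{1i}\omega_{1j})(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j).$$

But by a well known algebraic identity²,

$$\omega_{11}\omega_{ij} - \omega_{1i}\omega_{1j} = |\sigma_{ij}| \cdot [\text{cofactor of } (\sigma_{11}\sigma_{ij} - \sigma_{1i}\sigma_{1j}) \text{ in } |\sigma_{ij}|] = |\sigma_{ij}| \cdot \omega_{ij}^{1}$$

where we have written $\omega_{ij}^{1}$ to be the cofactor of $\sigma_{ij}$ in $|\sigma_{ij}^{1}|$, so

$$\text{Exp.} = -(N/2\lambda_{11}|\sigma_{ij}|)\sum\sum \omega_{ij}^{1}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j).$$

But $\lambda_{11}|\sigma_{ij}| = \omega_{11} = |\sigma_{ij}^{1}|$, hence

$$\text{Exp.} = -\frac{N}{2}\sum\sum \lambda_{ij}^{1}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j).$$

Therefore

$$R_1(\mathrm{i}) = \max_{\alpha_i}\; W\Big(\sum\sum \lambda_{ij}^{1}\alpha_i\alpha_j\Big)\, C e^{-\frac{N}{2}\sum\sum \lambda_{ij}^{1}(\bar{x}_i - \alpha_i)(\bar{x}_j - \alpha_j)}.$$

² See M. Bôcher, Introduction to Higher Algebra.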



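But then it follows in exactly the same way as with $R_0(E)$ that $R_1(\mathrm{i})$ is a monotonically increasing function of $\sum\sum \lambda_{ij}^{1}\bar{x}_i\bar{x}_j$. For the other functions corresponding to other hypotheses the argument is identical, and for risk density functions corresponding to hypotheses with more than one $\alpha_i \ne 0$, the same argument is repeated two or more times in succession to give the result.

We will show that for any value of the parameters $\alpha_1, \alpha_2, \cdots, \alpha_p$ the relation

$$\sum\sum \lambda_{ij}^{1}\alpha_i\alpha_j \le \sum\sum \lambda_{ij}\alpha_i\alpha_j$$

holds. This relation is true if the relation

(7) $\sum\sum [(\omega_{ij}/|\sigma_{ij}|) - (\omega_{ij}^{1}/|\sigma_{ij}^{1}|)]\,\alpha_i\alpha_j \ge 0$

is true, where we define $\omega_{1j}^{1} = 0$. That is, if

$$(1/|\sigma_{ij}^{1}||\sigma_{ij}|)\sum\sum (\omega_{11}\omega_{ij} - \omega_{ij}^{1}|\sigma_{ij}|)\,\alpha_i\alpha_j \ge 0$$

where we have substituted $\omega_{11}$ for its equal $|\sigma_{ij}^{1}|$. But note that

$$\omega_{ij}^{1} = \text{cofactor of } (\sigma_{11}\sigma_{ij} - \sigma_{1i}\sigma_{1j}) \text{ in } |\sigma_{ij}|$$

hence by the identity quoted (see footnote 2)

$$\omega_{ij}^{1}|\sigma_{ij}| = \omega_{11}\omega_{ij} - \omega_{1i}\omega_{1j}$$

so the left hand member of relation (7) is

$$(1/|\sigma_{ij}^{1}||\sigma_{ij}|)\sum\sum (\omega_{11}\omega_{ij} - \omega_{11}\omega_{ij} + \omega_{1i}\omega_{1j})\,\alpha_i\alpha_j = (1/|\sigma_{ij}^{1}||\sigma_{ij}|)\sum\sum \omega_{1i}\omega_{1j}\,\alpha_i\alpha_j = \Big[\sum_i \omega_{1i}\alpha_i\Big]^2 \Big/ (|\sigma_{ij}^{1}||\sigma_{ij}|) \ge 0$$

since all matrices here are symmetric and positive definite. Note that the argument can be repeated one or more times to show

$$W\Big(\sum\sum \lambda_{ij}\alpha_i\alpha_j\Big) \ge W\Big(\sum\sum \lambda_{ij}^{i_1 i_2 \cdots i_k}\alpha_i\alpha_j\Big)$$

or

$$W\Big(\sum\sum \lambda_{ij}^{j_1 j_2 \cdots j_r}\alpha_i\alpha_j\Big) \ge W\Big(\sum\sum \lambda_{ij}^{i_1 i_2 \cdots i_k}\alpha_i\alpha_j\Big)$$

where $i_1, i_2, \cdots, i_k$ are any set of $k$ different integers less than or equal to $p$, and $j_1, j_2, \cdots, j_r$ are any subset of $i_1, i_2, \cdots, i_k$.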
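The chain of inequalities between the quadratic forms can be checked numerically. The sketch below is illustrative only: the helper names and the example covariance matrix are hypothetical, indices are 0-based, and the matrix inverse is computed by plain Gauss-Jordan elimination so that no external library is assumed.

```python
def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in A[i]] + [1.0 if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [v / d for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def reduced_form(Q, alpha, struck):
    """Quadratic form in alpha using the inverse of Q with rows/columns `struck` removed.

    Terms involving a struck index are dropped, which matches setting the
    corresponding elements of the reduced inverse to zero.
    """
    keep = [i for i in range(len(Q)) if i not in struck]
    Ls = inverse([[Q[i][j] for j in keep] for i in keep])
    return sum(Ls[a][b] * alpha[i] * alpha[j]
               for a, i in enumerate(keep) for b, j in enumerate(keep))
```

Evaluating `reduced_form` with `struck = ()`, `(0,)` and `(0, 1)` for any positive definite `Q` should give a non-increasing sequence of non-negative values.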

Consider the maximum of the expressions 

We know that $(p - r)$ of the $\alpha_j$'s in these expressions are zero and by an argument similar to that given above*, it is clear that if the $r$ $\alpha_j$'s not equal to zero are $\alpha_{j_1}, \cdots, \alpha_{j_r}$, then the maximum of the expressions is given by


’ See p. 36. 



Also for $r = 0$, the maximum is obviously


Recall that we have restricted the $W_j$'s so that

(8) TFj ^ IFo" ^ g TF^’ and IF^i g ■ ■ ■ ^ fVS . 
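From a previous calculation, it follows that

(9) $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j \ge \sum\sum \lambda_{ij}^{i_1}\bar{x}_i\bar{x}_j \ge \sum\sum \lambda_{ij}^{i_1 i_2}\bar{x}_i\bar{x}_j \ge \cdots$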


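We can then quite easily calculate the region $M_0$, that is, the region of the sample space for which $R_0(E)$ is the minimum of all the $R_{i_1 i_2 \cdots i_k}(E)$'s. We have pointed out that

$$W\Big(\sum\sum \lambda_{ij}\alpha_i\alpha_j\Big) \ge W\Big(\sum\sum \lambda_{ij}^{i_1 i_2 \cdots i_k}\alpha_i\alpha_j\Big)$$

so it follows that

(10) $R_0(E) \ge R_{i_1 i_2 \cdots i_k}(\mathrm{i})$

that is, $R_0(E) \ge R_{i_1 i_2 \cdots i_k}(E)$ so long as $R_{i_1 i_2 \cdots i_k}(E)$ is defined by $R_{i_1 i_2 \cdots i_k}(\mathrm{i})$.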


From the relations (8) and (9), we have that 
(LI) ^ -^Ykg-iifxs^a^fxi 


for & = 2, 3, • • ■ , p. Now because 


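is a monotonic decreasing function of $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j$, and because $R_0(E)$ is a monotonically increasing function of $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j$, there is a value $r_0^2$ such that within the ellipse $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j = r_0^2$, the relation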

( 12 ) Ro{E) < 


holds, and outside it the opposite inequality holds. But from relations (10) and (12), it follows that within this ellipse, no $R_{i_1 i_2 \cdots i_k}(E)$ except $R_0(E)$ can be defined by $R_{i_1 i_2 \cdots i_k}(\mathrm{i})$. Then in view of relation (11) and since a quantity is certainly less than the maximum of several quantities if it is less than one of those several quantities, the region $M_0$ is the set of points $\sum\sum \lambda_{ij}\bar{x}_i\bar{x}_j < r_0^2$.

Now consider the funclmns i?„(B) in the region onfaide Mo. Wc know that 
RaiE) = RaiO udion 


max. of ll'(ISi;x",'a, a;f)e 







and we will write R,iE) = R^ii) when the opposite iiusiuality holds, Consider 
a part of the sample space outside j¥q in which 

if,,(75) = R.titi) 

R,,{E) = i?.,(xi) 


R„{E) = i?.,(u) 

wliere fc ^ 1, and where R,{,E) ^ if,(tt) for j h , i;, • • ■ , u , We see, in this 
case that R,^{E) = R^^{E) = •■■ = R,^(e) < RjiE), where aRain j ii, 
ii, ■ ■ ,ik. Furthermore, in this case, hecausi' of the lelation (11), we have 
that E should be a point of cither il/,, , Af,,, • • • or, M,^ . We will arbitrarily 
decide in this case that E should he a point of A/,, (s an integer ^ k) wheie 
i, is determined so that 

^ wSMJa-.Xj lor any / = 1,2, • • ■ , 

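Now consider the region in which $R_r(E) = R_r(\mathrm{i})$ for all $r = 1, 2, \cdots, p$. We see that each $R_r(\mathrm{i})$ is the same monotonically increasing function of a quadratic form of the type $\sum\sum \lambda_{ij}^{r}\bar{x}_i\bar{x}_j$. Hence in order that $E$ be a point of a particular $M_r$, it is necessary that

(13) $\sum\sum \lambda_{ij}^{r}\bar{x}_i\bar{x}_j \le \sum\sum \lambda_{ij}^{s}\bar{x}_i\bar{x}_j$ for all $s \ne r$.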

Now let us consider a fixed rand compare R,(,i) with all /I'n,,,, .,i(7'i)’Hfoi’fc ^ I* 
We have pointed out that 

(14) 22:X<,a:,a;y ^ ’ Wi;„c, 

80 Rr(,i) ^ ifruij ■iA,('i) and hence Rrii) can be a minimum at most when all 
Rri^i' arc defined by other than ifr,,,2 •••i('i)- 

Consider then, any RriiiE) when defined by other than Rtii(^), that, is when 
RrtiiE) is equal to one of 

^ rir„{ii-i) (.say) 

Because of the relations (8) and (14), wc have that 

RrM g .,,(/?) 

whenever these arc defined by other than R„^{^) mid Rn-^vr-'^i.^'O- b'urlhcrrnorc 
in the region defined by (13), we see that Rmiii) ^ 7J,(,(fw), hence if,,, (E) is 
never defined by Rri^iiii) in this region. 

Now the relation Rrii) < lir,i(w) is easily seen to Ix' eciiiivalent to the relation 

(15) S'Sy.jx.x, < rj 






for some value ri. With the restriction on TTo that it be not so much larger 
than Wi that when (12) does not hold, R„^(E) is not defined by Rri^(w), we have 
that the region for which Rr{i) < is the region defined by (18) and 

( 16 ). 

We then restrict the relationship between the constants Wl and Wl to be 
such that for all points outside of Mo but within the region defined by (13) and 
(15), the relation ^ holds for ••• , jk each 

different from r. Note that this is not an unreasonable restriction since the right 
hand side of the relation is bounded above by rl, 2SX,ja;ja:j is bounded below by 
rl, and therefore, 22X(j'' is bounded below by some positive value 

where r* is a monotonically increasing function of ro 

Using a similar method, the region M,j,j. can be obtained after all regions 
-‘ri ^11 m < fc have been derived. If some fuither restrictions are 
imposed on the constants in the weight functions similar to those formulated 
in deriving the region Mr , it can be shown that the region M,i<j, ,j(fc S 1) 
will be given by the inequalities 

XX\ijXiXj S rl 

^ rl, for all m < k and all ji, •• , 3 ,,, 

S for all ji, ■ - Jk 

and 

22X{)'^ •■'’‘x,x^ < rl 

Thus we have rationalized the following solution of the question posed at the beginning of section 4. We test the hypothesis $E(x_1) = E(x_2) = \cdots = E(x_p) = 0$ using the generalized Student ratio, replacing the sample covariance matrix by the population covariance matrix since the latter is assumed to be known, at some chosen level of significance. If the hypothesis is not rejected, we make the decision corresponding to $H_0$. If the ratio is significant, we compute the ratios $T^1, T^2, \cdots, T^p$ where by definition $T^{i_1 i_2 \cdots i_k}$ is the generalized Student ratio computed for $x_{j_1}, x_{j_2}, \cdots, x_{j_{p-k}}$ ($i_1, i_2, \cdots, i_k, j_1, j_2, \cdots, j_{p-k}$ is a permutation of the integers $1, 2, \cdots, p$), the variates $x_{i_1}, x_{i_2}, \cdots, x_{i_k}$ being ignored.

We consider the smallest of the ratios computed on the basis of $(p - 1)$ of the $x_i$'s; say it is $T^r$. Then if $T^r$ is not significant at some level of significance (which need not be the same level as considered before), we make the decision corresponding to $H_r$; if $T^r$ is significant, we compute all the ratios based on $(p - 2)$ of the $x_i$'s. If $T^{rs}$ is the smallest of these, we make the decision corresponding to $H_{rs}$ if $T^{rs}$ is not significant, but proceed to calculate the ratios based on $(p - 3)$ of the $x_i$'s if it is significant, and so on.
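The stepwise rule just described might be sketched as follows. Several things here are assumptions of the illustration rather than statements of the paper: the ratio for a retained set of variates is taken to be $N$ times the quadratic form in their sample means with the inverse of the corresponding known covariance submatrix; the critical values are supplied by the caller in a hypothetical mapping `crit` from the number of retained variates to a threshold; and indices are 0-based.

```python
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ratio(cov, xbar, N, keep):
    """N * xbar_S' (cov_SS)^(-1) xbar_S for the retained indices `keep`."""
    sub = [[cov[i][j] for j in keep] for i in keep]
    v = [xbar[i] for i in keep]
    w = solve(sub, v)
    return N * sum(a * b for a, b in zip(v, w))

def choose_hypothesis(cov, xbar, N, crit):
    """Return the tuple of indices whose means are judged non-zero."""
    p = len(xbar)
    for k in range(p):                      # k = number of variates ignored
        best = min(combinations(range(p), p - k),
                   key=lambda keep: ratio(cov, xbar, N, keep))
        if ratio(cov, xbar, N, best) <= crit[p - k]:
            return tuple(i for i in range(p) if i not in best)
    return tuple(range(p))                  # every ratio was significant
```

An empty tuple corresponds to the decision $H_0$; a returned `(0,)` corresponds to $H_1$ in the paper's 1-based notation.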

6. Concluding remarks. It should be pointed out that while the derivation of the explicit inequalities defining the various regions of acceptance may be rather involved, for any given sample point $E$, it is relatively simple to determine the region of acceptance to which this point $E$ belongs. That is, we calculate the various values $R_{i_1 i_2 \cdots i_k}(E)$ and choose the decision $H_{j_1 j_2 \cdots j_l}$ if $R_{j_1 j_2 \cdots j_l}(E)$ is the minimum of the values of $R_{i_1 i_2 \cdots i_k}(E)$ for all values of $i_1, i_2, \cdots, i_k$. For making a decision on the basis of a given sample point $E$, it is not necessary to find explicit analytic formulas defining the shapes of the various regions of acceptance.

Since the principle used here is proposed merely as a substitute for Wald's principle for the sake of mathematical simplification, it is felt that in certain problems Wald's principle may be used as a check on the results. For example, it is felt that the new principle is apt to lead to decision regions of the proper shape though the exact sizes of these regions may not be correct. In cases where the decision regions cannot be determined by Wald's principle, it seems possible that a determination may be made in Wald's sense among the various decision regions having the same shapes as those given by the new principle. In the

case considered here, for example, it may be possible to determine new values of 
2 ! 2 

I should like to express my very great appreciation to Professor H. Hotelling for many suggestions during the preparation of this paper and to Professor A. Wald for constant guidance. I should also like to credit Professor Helen Walker with originally posing the question that led to this research.

REFERENCES

[1] A. Wald, Annals of Math. Stat., Vol. 10 (1939), pp. 299-328.

[2] A. Wald, On the Principles of Statistical Inference, Notre Dame, Ind., 1942.

[3] J. Neyman and E. Pearson, Transactions of the Royal Society, A, Vol. 231 (1933), p. 296.

[4] H. Hotelling, Annals of Math. Stat., Vol. 2 (1931), pp. 360-378.



A TWO-SAMPLE TEST FOR A LINEAR HYPOTHESIS WHOSE POWER
IS INDEPENDENT OF THE VARIANCE

By Charles Stein
Asheville, N. C.

1. Introduction. In a paper in the Annals of Mathematical Statistics, Dantzig [1] proves that, for a sample of fixed size, there does not exist a test for Student's hypothesis whose power is independent of the variance. Here, a two-sample test with this property will be presented, the size of the second sample depending upon the result of the first. The problem of determining confidence intervals, of preassigned length and confidence coefficient, for the mean of a normal distribution with unknown variance is solved by the same procedure. These considerations, including the non-existence of a single-sample test whose power is independent of the variance, are extended to the case of a linear hypothesis. In order to make the power of a test or the length of a confidence interval exactly independent of the variance, it appears necessary to waste a small part of the information. Thus, in practical applications, one will not use a test with this property, but rather a test which is uniformly more powerful, or an interval of the same length, whose confidence coefficient is a function of $\sigma$, but always greater than the desired value, the difference usually being slight, at the same time reducing the expected number of observations by a small amount.

Any two-sample procedure, such as that discussed in this paper, can be considered a special case of sequential analysis developed by Wald [5].

The problem of whether these tests and confidence intervals are in any sense 
optimum is unsolved. It is difficult even to formulate a definition of an optimum 
among sequential tests of a hypothesis against multiple alternatives. However 
it is shown that, if the variance and initial sample size are sufficiently large, the
expected number of observations differs only slightly from the number of ob¬ 
servations required for a single-sample test when the variance is known. It also 
seems likely that the confidence intervals do possess some optimum property 
among the class of all two-sample procedures. 

Although Student’s hypothesis is a special case of a linear hypothesis, it is 
treated separately, because it illustrates the basic idea without any complicated 
notation or new distributions. The test for Student’s hypothesis involves the 
use only of Student’s distribution, even for the power of the test, while the power 
function of the test proposed here for a linear hypothesis involves a new type of 
non-central F-distribution. 

The notation $\chi^2_n$ is used as a generic symbol for a random variable equal to the sum of squares of $n$ independently normally distributed random variables with mean 0 and variance 1, i.e., $\chi^2_n$ has the $\chi^2$ distribution with $n$ degrees of freedom,

$$P\{\chi^2_n < \tau\} = \frac{1}{(\sqrt{2})^n\,\Gamma(\frac{1}{2}n)} \int_0^{\tau} u^{\frac{1}{2}n - 1} e^{-\frac{1}{2}u}\,du \quad \text{for } \tau > 0$$

$$= 0 \quad \text{for } \tau < 0.$$


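The notation $t_n$ is used as a generic symbol for $x\sqrt{n}/\chi_n$, where $x$ is normally distributed with mean 0 and variance 1, independently of $\chi_n$, i.e., $t_n$ has the distribution of Student's $t$ with $n$ degrees of freedom,

$$P\{t_n < \tau\} = \frac{\Gamma(\frac{1}{2}(n+1))}{\sqrt{n\pi}\,\Gamma(\frac{1}{2}n)} \int_{-\infty}^{\tau} \Big(1 + \frac{s^2}{n}\Big)^{-\frac{1}{2}(n+1)} ds.$$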


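$F_{m,n}$ is a generic symbol for a random variable of the form $F_{m,n} = n\chi^2_m / m\chi^2_n$, the numerator and denominator being independently distributed, i.e., $F_{m,n}$ has the distribution of an $F$-ratio with $m$ and $n$ degrees of freedom,

$$P\{F_{m,n} < \tau\} = \frac{\Gamma(\frac{1}{2}(m+n))}{\Gamma(\frac{1}{2}m)\,\Gamma(\frac{1}{2}n)} \int_0^{\tau} \Big(\frac{m}{n}\Big)^{\frac{1}{2}m} F^{\frac{1}{2}m - 1} \Big(1 + \frac{m}{n}F\Big)^{-\frac{1}{2}(m+n)} dF.$$

A symbol of the above type with an additional subscript $\alpha$ denotes the upper $100\alpha\%$ significance level, e.g., $t_{n,\alpha}$ is defined by

$$P\{t_n > t_{n,\alpha}\} = \alpha.$$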


The symbol $E\{x \mid Q(x)\}$ denotes the set of all $x$ such that the condition $Q(x)$ holds. This should not be confused with $E(x \mid T)$, which denotes the expected value of a random variable $x$, given the conditions $T$.

The size of a critical region is the probability that the sample point will lie 
within the region under the null hypothesis. The terms length and volume, as 
applied to confidence regions are used in the ordinary geometrical sense. 


2. The test for Student's hypothesis. Suppose $x_i$, $i = 1, 2, \cdots$ are independently normally distributed with mean $\xi$ and variance $\sigma^2$. We wish to test the hypothesis $\xi = \xi_0$, the power of the test to depend only upon $\xi - \xi_0$, not upon $\sigma^2$. For this purpose we define a statistic $t'$ as follows. A sample of $n_0$ observations, $x_1, \cdots, x_{n_0}$, is taken, and the sample estimate, $s^2$, of the variance computed by

(1) $s^2 = \dfrac{1}{n_0 - 1}\Big\{\sum_{i=1}^{n_0} x_i^2 - \dfrac{1}{n_0}\Big(\sum_{i=1}^{n_0} x_i\Big)^2\Big\}.$

Then $n$ is defined by

(2) $n = \max\Big\{\Big[\dfrac{s^2}{z}\Big] + 1,\; n_0 + 1\Big\}$

where $z$ is a previously specified positive constant, $[q]$ denoting the largest integer less than $q$. Additional observations, $x_{n_0+1}, \cdots, x_n$ are taken, and, in accordance with an initially specified rule depending only upon $s^2$, real numbers $a_i$, $i = 1, \cdots, n$ are chosen in such a way that

(3) $\sum_{1}^{n} a_i = 1, \qquad a_1 = a_2 = \cdots = a_{n_0}, \qquad s^2\sum_{1}^{n} a_i^2 = z.$

This is clearly possible since

(4) $\min \sum_{1}^{n} a_i^2 = \dfrac{1}{n} \le \dfrac{z}{s^2}$ by (2),

the minimum being taken subject to the conditions

$$\sum_{1}^{n} a_i = 1, \qquad a_1 = a_2 = \cdots = a_{n_0}.$$

Then $t'$ is defined by

(5) $t' = \dfrac{\sum_{1}^{n} a_i x_i - \xi_0}{\sqrt{z}} = \dfrac{\sum_{1}^{n} a_i(x_i - \xi)}{\sqrt{z}} + \dfrac{\xi - \xi_0}{\sqrt{z}} = u + \dfrac{\xi - \xi_0}{\sqrt{z}},$

where

(6) $u = \dfrac{\sum_{1}^{n} a_i(x_i - \xi)}{\sqrt{z}}.$

Then $u$ has the distribution of Student's $t$ with $n_0 - 1$ degrees of freedom, regardless of the value of $\sigma^2$. For $(n_0 - 1)s^2/\sigma^2$ has the distribution of $\chi^2_{n_0-1}$, and the conditional distribution of $\sum_{1}^{n} a_i(x_i - \xi)/\sqrt{z} = u$, given $s$, is normal with mean 0 and variance $\sigma^2\sum a_i^2/z = \sigma^2/s^2$. But the usual form of a random variable $t_{n_0-1}$ is $t_{n_0-1} = y/s$, $y$ being normally distributed with mean 0 and variance $\sigma^2$, and $(n_0 - 1)s^2/\sigma^2$ having the distribution of $\chi^2_{n_0-1}$, independent of $y$. Thus the conditional distribution of $t_{n_0-1}$, given $s$, is normal with mean 0 and variance $\sigma^2/s^2$, so that $t_{n_0-1}$ and $u$ have the same distribution.
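The construction of $n$, the weights $a_i$, and $t'$ can be sketched in a few lines. This is a minimal illustration under stated assumptions: the paper only requires some rule satisfying (3), and the particular rule used here (one common value $a$ for the first $n_0$ weights and one common value $b$ for the rest) is a choice made for this example. The callable `draw`, which supplies the second sample, and all names are hypothetical; the sketch also assumes $s^2 > 0$.

```python
import math
import statistics

def stein_two_sample(first, draw, z, xi0):
    """Return (t_prime, n, weights) for the two-sample statistic t'.

    first : the n0 initial observations
    draw  : hypothetical callable; draw(k) returns k further observations
    z     : the previously specified positive constant
    xi0   : the hypothesized mean
    """
    n0 = len(first)
    s2 = statistics.variance(first)          # divisor n0 - 1, as in (1)
    # [q] is the largest integer less than q, so [q] + 1 equals ceil(q).
    n = max(math.ceil(s2 / z), n0 + 1)
    m = n - n0
    xs = list(first) + list(draw(m))
    c = z / s2                               # required value of sum(a_i^2)
    # Assumed rule: a_1 = ... = a_n0 = a and a_(n0+1) = ... = a_n = b,
    # solved from sum(a_i) = 1 and sum(a_i^2) = c.
    a = 1.0 / n + math.sqrt(max(m * (n * c - 1.0) / n0, 0.0)) / n
    b = (1.0 - n0 * a) / m
    weights = [a] * n0 + [b] * m
    t_prime = (sum(w * x for w, x in zip(weights, xs)) - xi0) / math.sqrt(z)
    return t_prime, n, weights
```

The returned `t_prime` would then be referred to Student's distribution with $n_0 - 1$ degrees of freedom, whatever the value of $\sigma^2$.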

This theorem can be used to obtain an unbiased test for the hypothesis $H_0$ that $\xi = \xi_0$, the power being independent of $\sigma^2$, which is supposed unknown. Let $\alpha$ be the desired size of the critical region and let $t_{n_0-1,\alpha/2}$ be such that

(7) $P\{t_{n_0-1} > t_{n_0-1,\alpha/2}\} = \dfrac{\alpha}{2}.$

Then if we reject $H_0$ whenever

(8) $\Big|\dfrac{\sum a_i x_i - \xi_0}{\sqrt{z}}\Big| > t_{n_0-1,\alpha/2},$

we obtain an unbiased test of $H_0$, whose power function is $1 - \beta(\xi)$ where

(9) $\beta(\xi) = P\Big\{-t_{n_0-1,\alpha/2} + \dfrac{\xi_0 - \xi}{\sqrt{z}} < t_{n_0-1} < t_{n_0-1,\alpha/2} + \dfrac{\xi_0 - \xi}{\sqrt{z}}\Big\}.$

The fact that the test is unbiased follows immediately from the symmetry and unimodality of the $t$ distribution.

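If we wish to test the hypothesis $H_0: \xi = \xi_0$ against one-sided alternatives $\xi > \xi_0$, the procedure is similar. The critical region of size $\alpha$ is defined by

(10) $\dfrac{\sum a_i x_i - \xi_0}{\sqrt{z}} > t_{n_0-1,\alpha}$

and the power function is

(11) $1 - \beta(\xi) = P\Big\{t_{n_0-1} > t_{n_0-1,\alpha} - \dfrac{\xi - \xi_0}{\sqrt{z}}\Big\}.$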


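A confidence interval for $\xi$ of predetermined length $l$ and confidence coefficient $1 - \alpha$ can be obtained by selecting $z$ so that

$$1 - \alpha = P\Big\{-\frac{l}{2\sqrt{z}} < t_{n_0-1} < \frac{l}{2\sqrt{z}}\Big\}$$

$$= P\Big\{-\frac{l}{2\sqrt{z}} < \frac{\sum a_i(x_i - \xi)}{\sqrt{z}} < \frac{l}{2\sqrt{z}}\Big\}$$

(12) $$= P\Big\{-\frac{l}{2} < \sum a_i x_i - \xi < \frac{l}{2}\Big\}$$

$$= P\Big\{\sum a_i x_i - \frac{l}{2} < \xi < \sum a_i x_i + \frac{l}{2}\Big\}$$

where $\xi$ is the true mean of the distribution. Thus $(\sum a_i x_i - l/2,\; \sum a_i x_i + l/2)$ is the desired confidence interval.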
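Relation (12) fixes $z$ once the length $l$ and the critical value $t_{n_0-1,\alpha/2}$ are chosen, since $l/(2\sqrt{z})$ must equal that critical value. A minimal sketch, with hypothetical function names and the critical value passed in by the caller (it would be read from tables of Student's distribution):

```python
def z_for_length(length, t_crit):
    """Value of z for which l / (2 * sqrt(z)) equals t_crit = t_{n0-1, alpha/2}."""
    return (length / (2.0 * t_crit)) ** 2

def confidence_interval(weighted_mean, length):
    """Interval of fixed length centred at the weighted mean sum(a_i * x_i)."""
    return weighted_mean - length / 2.0, weighted_mean + length / 2.0
```

The value returned by `z_for_length` is the constant $z$ that enters (2) and (3) when the second sample size and the weights are determined.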

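In the above tests and confidence intervals, the distribution of the required number of observations, $n$, is

(13) $P\{n = n_0 + 1\} = P\{(n_0 - 1)s^2/\sigma^2 < (n_0 + 1)(n_0 - 1)z/\sigma^2\} = P\{\chi^2_{n_0-1} < y\}$

$$= \frac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\frac{1}{2}(n_0 - 1))} \int_0^{y} u^{\frac{1}{2}(n_0-3)} e^{-\frac{1}{2}u}\,du$$

where $y = (n_0^2 - 1)z/\sigma^2$,

(14) $P\{n = \nu\} = P\Big\{\nu < \dfrac{s^2}{z} + 1 < \nu + 1\Big\} = P\{(\nu - 1)(n_0 - 1)z/\sigma^2 < \chi^2_{n_0-1} < \nu(n_0 - 1)z/\sigma^2\}$

$$= \frac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\frac{1}{2}(n_0 - 1))} \int_{(\nu-1)(n_0-1)z/\sigma^2}^{\nu(n_0-1)z/\sigma^2} u^{\frac{1}{2}(n_0-3)} e^{-\frac{1}{2}u}\,du$$

for integral $\nu > n_0 + 1$, all other values being impossible. Thus the expected number of observations, $E(n)$, satisfies the inequalities

(15) $(n_0 + 1)P\{\chi^2_{n_0-1} < y\} + \dfrac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\frac{1}{2}(n_0 - 1))} \displaystyle\int_y^{\infty} \dfrac{\sigma^2 u}{(n_0 - 1)z}\, u^{\frac{1}{2}(n_0-3)} e^{-\frac{1}{2}u}\,du$

$$< E(n) < (n_0 + 1)P\{\chi^2_{n_0-1} < y\} + \frac{1}{(\sqrt{2})^{n_0-1}\,\Gamma(\frac{1}{2}(n_0 - 1))} \int_y^{\infty} \Big(\frac{\sigma^2 u}{(n_0 - 1)z} + 1\Big) u^{\frac{1}{2}(n_0-3)} e^{-\frac{1}{2}u}\,du$$

which can be rewritten

(16) $(n_0 + 1)P\{\chi^2_{n_0-1} < y\} + \dfrac{\sigma^2}{z}P\{\chi^2_{n_0+1} > y\} < E(n) < (n_0 + 1)P\{\chi^2_{n_0-1} < y\} + \dfrac{\sigma^2}{z}P\{\chi^2_{n_0+1} > y\} + P\{\chi^2_{n_0-1} > y\}.$

Consequently $E(n)$ is a function of $\sigma^2/z$ and can be evaluated from tables of the incomplete $\Gamma$ function.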
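The behaviour of $E(n)$ described by (16) can be examined by simulation. The sketch below is only an illustration with hypothetical names and parameter values: it draws the first sample, forms $s^2$, and applies rule (2). When $\sigma^2/z$ is large compared with $n_0$, the average of the simulated $n$ should lie close to $\sigma^2/z$, between roughly $\sigma^2/z$ and $\sigma^2/z + 1$.

```python
import math
import random

def simulate_n(n0, sigma, z, rng):
    """Draw one value of n = max{[s^2/z] + 1, n0 + 1} for a normal first sample."""
    xs = [rng.gauss(0.0, sigma) for _ in range(n0)]
    mean = sum(xs) / n0
    s2 = sum((x - mean) ** 2 for x in xs) / (n0 - 1)
    # [q] + 1 with [q] the largest integer less than q equals ceil(q).
    return max(math.ceil(s2 / z), n0 + 1)

def mean_n(n0, sigma, z, trials, seed=0):
    """Monte Carlo estimate of E(n) from `trials` independent first samples."""
    rng = random.Random(seed)
    return sum(simulate_n(n0, sigma, z, rng) for _ in range(trials)) / trials
```

With `n0 = 10`, `sigma = 1` and `z = 0.01`, so that $\sigma^2/z = 100$, the estimate from a few thousand trials should fall near 100.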

As mentioned in the introduction, these tests and confidence intervals will not be used exactly in this form, since they waste information in order to make the power of the test or the length of the confidence interval strictly independent of the variance. Instead of (2) we take a total of

(17) $n = \max\Big\{\Big[\dfrac{s^2}{z}\Big] + 1,\; n_0\Big\}$

observations, and define

(18) $t'' = \dfrac{\frac{1}{n}\sum_{1}^{n} x_i - \xi_0}{s}\sqrt{n} = \dfrac{\frac{1}{n}\sum_{1}^{n} x_i - \xi}{s}\sqrt{n} + \dfrac{\xi - \xi_0}{s}\sqrt{n} = u' + \dfrac{\xi - \xi_0}{s}\sqrt{n}.$

By the same reasoning as that following (6), $u'$ has the $t$ distribution with $n_0 - 1$ degrees of freedom. By (2)

(19) $n > s^2/z$

so that, although $\dfrac{\xi - \xi_0}{s}\sqrt{n}$ is a random variable,

(20) $\Big|\dfrac{\xi - \xi_0}{s}\Big|\sqrt{n} > \dfrac{|\xi - \xi_0|}{\sqrt{z}}.$

Thus, if we use

(21) $|t''| > t_{n_0-1,\alpha/2}$ or $t'' > t_{n_0-1,\alpha}$


instead of (8) or (10) respectively, we shall always increase the power of the test. Also the expected number of observations will be reduced from that in (16) by
< 2/). Similarly if g is defined as in (12), the, interval 



has length $l$, and the probability that it covers the true mean $\xi$ is a function of $\sigma$, but is always greater than $1 - \alpha$, and differs only slightly from $1 - \alpha$ if $\sigma^2 > n_0 z$. Thus it can be used instead of the confidence interval (12).

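From (16) it follows that

$$\limsup_{\sigma \to \infty}\Big[E(n) - \frac{\sigma^2}{z}\Big] \le 1, \qquad \liminf_{\sigma \to \infty}\Big[E(n) - \frac{\sigma^2}{z}\Big] \ge 0,$$

the approximation $E(n) = \sigma^2/z$ being fair provided $\sigma^2 > z n_0$. The length of the confidence interval (12) is given by

$$l = 2t_{n_0-1,\alpha/2}\sqrt{z} \approx \frac{2\sigma\, t_{n_0-1,\alpha/2}}{\sqrt{E(n)}}.$$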

When the variance $\sigma^2$ is known, the length $l'$ of the single-sample confidence interval of confidence coefficient $1 - \alpha$ obtained on the basis of $n$ observations is given by

$$1 - \alpha = \frac{1}{\sqrt{2\pi}} \int_{-l'\sqrt{n}/2\sigma}^{\,l'\sqrt{n}/2\sigma} e^{-\frac{1}{2}x^2}\,dx$$

i.e.,

$$l' = 2t_{\infty,\alpha/2}\,\sigma/\sqrt{n}.$$

Since, even for moderate values of $n_0$, say $n_0 > 30$, $t_{n_0-1,\alpha/2}$ differs only slightly from $t_{\infty,\alpha/2}$, the expected number of observations for a confidence interval of given length and confidence coefficient is only slightly larger than the fixed number of observations required in the single-sample case when the variance is known, provided the variance is moderately large.

3. Distribution of a non-central F-ratio. In the extension of the above considerations to the testing of a general linear hypothesis, the power function depends on the distribution of a quantity

(22) $\tau'^2 = \sum_{i=1}^{m} (z_i - c_i)^2$

where $z_i = x_i/\sqrt{r}$, the $x_i$ being independently normally distributed with mean 0 and variance 1, and $r$ having the $\chi^2_n$ distribution, independently of the $x_i$. The $c_i$ are real constants.

Let 


(23) 


(24) 


7}^ W* 

X = Z = Z (s? - r 

1 1 

= Z (siv - Cv Vr) - - Vr y P cy ■ 


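Now, $\sum_1^m \big(x_i - c_i\eta/\sqrt{\sum c_j^2}\big)^2$ is a quadratic form of rank $m - 1$ since the 
$x_i - c_i\eta/\sqrt{\sum c_j^2}$ are subject to one linear homogeneous restriction, namely 
$\sum_1^m c_i \big(x_i - c_i\eta/\sqrt{\sum c_j^2}\big) = 0$. Also $\eta^2$ is of rank 1, and 
$\chi'^2 + \eta^2 = \sum_1^m x_i^2$, so that, by Cochran's Theorem, $\chi'^2$ and $\eta^2$ 
are independently distributed as $\chi^2_{m-1}$ and $\chi^2_1$ respectively. Thus there exist 
$y_1, \cdots, y_m$, independently normally distributed with mean 0 and variance 1 
such that 

(25)   $\eta = y_1$,   $\chi'^2 = y_2^2 + \cdots + y_m^2$. 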


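Let $u_i = y_i/\sqrt{r}$. Then the joint distribution of $u_1, \cdots, u_m$ is given by 

(26)   $P\{u_1 < \tau_1, \cdots, u_m < \tau_m\} = \dfrac{1}{(\sqrt{2\pi})^m (\sqrt{2})^n \Gamma(\tfrac12 n)} \displaystyle\int_0^\infty \int_{-\infty}^{\tau_1\sqrt{r}} \cdots \int_{-\infty}^{\tau_m\sqrt{r}} r^{\frac12 n - 1} e^{-\frac12 (r + \sum y_i^2)}\, dy_1 \cdots dy_m\, dr$. 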





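The density function is given by 

(27)   $\dfrac{\partial^m P\{u_1 < \tau_1, \cdots, u_m < \tau_m\}}{\partial\tau_1 \cdots \partial\tau_m} = \dfrac{1}{(\sqrt{2\pi})^m (\sqrt{2})^n \Gamma(\tfrac12 n)} \displaystyle\int_0^\infty r^{\frac12(m+n)-1} e^{-\frac12 r (1 + \sum \tau_i^2)}\, dr = \dfrac{\Gamma(\tfrac12(n+m))}{(\sqrt{\pi})^m \Gamma(\tfrac12 n)} \Big(1 + \sum_1^m \tau_i^2\Big)^{-\frac12(m+n)}$. 

Then let 

(28)   $\eta' = \dfrac{y_1}{\sqrt{r}} = u_1$,   $\tau'^2 = \dfrac{\chi'^2}{r} = u_2^2 + \cdots + u_m^2$. 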

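The joint distribution of $\eta'$ and $\tau'^2$ is thus, by (27), 

(29)   $P\{\eta' < \eta,\ \tau'^2 < \tau^2\}$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{(\sqrt{\pi})^m \Gamma(\tfrac12 n)} \displaystyle\int \cdots \int_{u_1 < \eta,\ \sum_2^m u_i^2 < \tau^2} \Big(1 + \sum_1^m u_i^2\Big)^{-\frac12(m+n)} du_1 \cdots du_m$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{(\sqrt{\pi})^m \Gamma(\tfrac12 n)} \displaystyle\int \cdots \int_{u_1 < \eta,\ \sum_2^m y_i^2 < \tau^2/(1+u_1^2)} (1 + u_1^2)^{-\frac12(m+n)} \Big(1 + \sum_2^m y_i^2\Big)^{-\frac12(m+n)} (1 + u_1^2)^{\frac12(m-1)}\, du_1\, dy_2 \cdots dy_m$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{(\sqrt{\pi})^m \Gamma(\tfrac12 n)} \displaystyle\int \cdots \int_{u_1 < \eta,\ \sum_2^m y_i^2 < \tau^2/(1+u_1^2)} (1 + u_1^2)^{-\frac12(n+1)} \Big(1 + \sum_2^m y_i^2\Big)^{-\frac12(m+n)}\, du_1\, dy_2 \cdots dy_m$, 

where $y_i = u_i/\sqrt{1 + u_1^2}$, $i = 2, \cdots, m$. 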


In order to evaluate this integral, we use the fact that the distribution of a ratio 
of $\chi^2_{m-1}$ to $\chi^2_{n+1}$, the two being independent, can be expressed in two forms, by 
(27) and Wilks [2], p. IM, 

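(30)   $P\{\chi^2_{m-1}/\chi^2_{n+1} < \psi^2\} = \dfrac{\Gamma(\tfrac12(m+n))}{\Gamma(\tfrac12(m-1))\,\Gamma(\tfrac12(n+1))} \displaystyle\int_0^{\psi^2} \varphi^{\frac12(m-1)-1} (1 + \varphi)^{-\frac12(m+n)}\, d\varphi$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{(\sqrt{\pi})^{m-1}\,\Gamma(\tfrac12(n+1))} \displaystyle\int \cdots \int_{\sum q_i^2 < \psi^2} \Big(1 + \sum_1^{m-1} q_i^2\Big)^{-\frac12(m+n)} dq_1 \cdots dq_{m-1}$. 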





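(31)   $P\{\eta' < \eta,\ \tau'^2 < \tau^2\}$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{\sqrt{\pi}\,\Gamma(\tfrac12 n)\,\Gamma(\tfrac12(m-1))} \displaystyle\int_{u_1=-\infty}^{\eta} \int_{\varphi=0}^{\tau^2/(1+u_1^2)} (1 + u_1^2)^{-\frac12(n+1)} \varphi^{\frac12(m-3)} (1 + \varphi)^{-\frac12(m+n)}\, d\varphi\, du_1$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{\sqrt{\pi}\,\Gamma(\tfrac12 n)\,\Gamma(\tfrac12(m-1))} \displaystyle\int_{u=-\infty}^{\eta} \int_{\varphi=0}^{\tau^2} \varphi^{\frac12(m-3)} (1 + u^2 + \varphi)^{-\frac12(m+n)}\, d\varphi\, du$. 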


Now we wish to find the distribution of 

(32)   $F' = \sum_{i=1}^{m} (t_i - c_i)^2 = \dfrac{1}{r} \sum_{i=1}^{m} (x_i - c_i\sqrt{r})^2 = \tau'^2 + \Big(\eta' - \sqrt{\sum c_i^2}\Big)^2$. 
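The decomposition of $F'$ into $\tau'^2$ and a shifted $\eta'$ can be checked numerically. The following is only a sketch, assuming that $\eta'$ is the projection $\sum c_i t_i / \sqrt{\sum c_i^2}$ and that $\tau'^2 = \sum t_i^2 - \eta'^2$; the function name and the numerical values are hypothetical and serve purely as illustration.

```python
import math

def split_f_ratio(t, c):
    """Evaluate F' = sum (t_i - c_i)^2 directly and via tau'^2 + (eta' - sqrt(sum c_i^2))^2."""
    norm_c = math.sqrt(sum(ci * ci for ci in c))           # sqrt(sum c_i^2)
    eta = sum(ci * ti for ci, ti in zip(c, t)) / norm_c     # eta' (assumed definition)
    tau_sq = sum(ti * ti for ti in t) - eta * eta           # tau'^2 (assumed definition)
    direct = sum((ti - ci) ** 2 for ti, ci in zip(t, c))
    split = tau_sq + (eta - norm_c) ** 2
    return direct, split

# Hypothetical values of t_i and c_i, chosen only for illustration.
direct, split = split_f_ratio([0.3, -1.2, 0.7], [1.0, 2.0, -0.5])
```

The two evaluations agree up to rounding, which is the algebraic identity used in passing from the joint distribution of $\eta'$ and $\tau'^2$ to that of $\eta'$ and $F'$.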

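Carrying out the transformation (32), it is found that the joint density function 
of $\eta'$ and $F'$ is 

(33)   $p(\eta', F')\, d\eta'\, dF' = \dfrac{\Gamma(\tfrac12(m+n))}{\sqrt{\pi}\,\Gamma(\tfrac12 n)\,\Gamma(\tfrac12(m-1))} \Big[F' - \big(\eta' - \sqrt{\sum c_i^2}\big)^2\Big]^{\frac12(m-3)} \Big[1 + \eta'^2 + F' - \big(\eta' - \sqrt{\sum c_i^2}\big)^2\Big]^{-\frac12(m+n)} d\eta'\, dF'$ 

$= \dfrac{\Gamma(\tfrac12(m+n))}{\sqrt{\pi}\,\Gamma(\tfrac12 n)\,\Gamma(\tfrac12(m-1))} \big[F' - p^2\big]^{\frac12(m-3)} \Big[1 + F' + 2p\sqrt{\sum c_i^2} + \sum c_i^2\Big]^{-\frac12(m+n)} dp\, dF'$, 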


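where $p = \eta' - \sqrt{\sum c_i^2}$. In order to obtain the distribution of $F'$ we must 
integrate out $p$ over $-\sqrt{F'} < p < \sqrt{F'}$, obtaining 

(34)   $P\{F' < T\} = F_{m,n}\big(T;\ \sum c_i^2\big) = \dfrac{\Gamma(\tfrac12(m+n))}{\sqrt{\pi}\,\Gamma(\tfrac12 n)\,\Gamma(\tfrac12(m-1))} \displaystyle\int_{F'=0}^{T} \int_{p=-\sqrt{F'}}^{\sqrt{F'}} \big[F' - p^2\big]^{\frac12(m-3)} \Big[1 + F' + 2p\sqrt{\sum c_i^2} + \sum c_i^2\Big]^{-\frac12(m+n)} dp\, dF'$. 

In the case $\sum c_i^2 = 0$, (34) reduces to the distribution of the ratio $\chi^2_m/\chi^2_n$. 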





4. Test of a linear hypothesis. In this case the power of the test usually 
employed is affected not only by the variance, but also by the values of the 
predictors. In order to avoid this difficulty, it will be assumed that only a 
predetermined number of different sets of predictors are used, and that these sets are 
repeated as a whole, as many times as is necessary. This covers, in particular, 
the replication of orthogonal designs for the analysis of variance. 

Let $y_{ij}$, $i = 1, \cdots, m$, $j = 1, 2, \cdots$ be independently normally distributed 
with means 

(35)   $E y_{ij} = \sum_{k=1}^{\mu} a_k x_{ki}$,   $\mu \le m$,   rank $(x_{ki}) = \mu$, 

and variance $\sigma^2$, the $x_{ki}$ being given in advance, $\sigma^2$ and $a_k$ unknown. We wish 
to test 

(36)   $H_0$: $\sum_{k=1}^{\mu} c_{lk} a_k = c_{l0}$,   $l = 1, \cdots, p \le \mu$, 

where we may suppose equations (36) linearly independent, the $c_{lk}$ being given 
constants. It will be convenient to reduce this to a canonical form, as in Tang [3]. 
First, by a non-singular linear transformation 

(37) xk, = 22 


we can make. 


(38) 



0;ii) — I) 


S) 


the $\mu \times \mu$ identity matrix, 

any two sets of $b$'s that accomplish this being related by an orthogonal 
transformation. Then (35) becomes 


(39) 


and (36) becomes 


(40) 


M / M \ (i ^ 

” S (52 = 52 , 

/ kr-.l 

M fl fi 

C/0 “ 5 'j C/^.Cr^ 5 ^ ('Ih 5". 

k—1 Jt—1 

= 22 22 Cn h”''' 

7n—1 A.W.I 

" 5 1 Cim (Im J I — X ' * * V fLj 

IflmX 


where 5"" 
(5l.m)“" = 

( 41 ) 


are such that = 8mi, the Kronecker delta, or, in matrix notation 

ib’’”').' Next, the equations (40) can be made into an orthonormal set 

// ^ / 

“* ^ 

Wl—l 





i.e., one in which 
(42) 


/ ^ Cfcffi Cim — offi 


by ix non-singular linear transformation on the c'lm • Clearly 2eio^ is an invariant 
of (41), i.e., it does not depend upon the choice of a particular transformation 
( 37 ), or of a particular tiansformation the c(„, into c'lm , since, in both cases, 
all admissible transformations are connected by an orthogonal transformation. 
Then xve define 


(43) 

(44) 

in such a way that 
Then 


/ij 


5-1 




i = \, ■ •, M 


2/ij * /X "t“ 1, • • • , Tfl 


5—1 


C::) is an orthogonal matrix which is possible, by (38). 

fn m 

Ey'>j = 52 2(5 ^2/5; = 12 8.5 52 


(45) 


(46) 


( 2«1 






== 2 flfc 52 2 . 58*5 = for f = 1 , • • ■, n, 
*-i 5-1 

m tfl M 

^%q ^kq 
q-1 11=1 

m 

= £ a* d,q!Zkq = 0 for i = ^ + 1, 

Jlf^l g=-l 


m. 


Finally we define 
C47) 


ff t 

Vxj = Vxj, 


i ~ ^ \ , m 

(48) 2/»J i = Ij ' j ^ 

m=l 

fi 

(49) Z/ij ” <3^1712/fnj j ^ r d“ 1, - ^ 

mB=l 

wlieic the are such that ( ) is an orthogonal matrix Since the transforma- 

tion applied to the y^j to obtain ylj is orthogonal, the are independently 
normally distributed with variance Also 


(50) 

( 51 ) 


= 0, 1 = *1 -h 1, 


^y tl 5 j Cim Om OiO , % 1 , ' ' ' I 1 

m=l 

ij ~ 5 ] CviM dm , 


(52) 


i = r + I, • , M- 





Since (50), (51), (52) were obtained from the original formulation by a non-singular 
linear transformation, the derivation can be reversed, which implies 
the equivalence of (50), (51), (52) to the problem as originally formulated. 

Thus we can restate the problem in the following manner. Let $y_{ij}$, 
$i = 1, \cdots, t$, $j = 1, 2, \cdots$ be independently normally distributed with variance $\sigma^2$ 
and means 

(53)   $E y_{ij} = \xi_i$,   $i = 1, \cdots, \mu$, 
       $E y_{ij} = 0$,   $i = \mu + 1, \cdots, t$,   $\xi_i$ and $\sigma^2$ unknown. 

We wish to test 

(54)   $H_0$: $\xi_i = 0$,   $i = 1, \cdots, p$, 

the $\xi_i$ for $i = p + 1, \cdots, \mu$ and $\sigma^2$ being nuisance parameters. 

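Obtain a first sample $y_{ij}$, $i = 1, \cdots, t$, $j = 1, \cdots, n_0$. Estimate the 
variance by 

(55)   $s^2 = \dfrac{1}{n_0 t - \mu} \Big\{ \sum_{i=1}^{\mu} \sum_{j=1}^{n_0} (y_{ij} - \bar{y}_i)^2 + \sum_{i=\mu+1}^{t} \sum_{j=1}^{n_0} y_{ij}^2 \Big\}$,   $\bar{y}_i = \dfrac{1}{n_0} \sum_{j=1}^{n_0} y_{ij}$. 

Let $z$ be a predetermined constant, and $n$ be defined by 

(56)   $n = \max \Big\{ \Big[\dfrac{s^2}{z}\Big] + 1,\ n_0 + 1 \Big\}$. 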


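After $s^2$ has been obtained, determine a set of real numbers, $a_1, \cdots, a_n$, in 
accordance with a preassigned rule, so as to satisfy 

(57)   $\sum_{j=1}^{n} a_j = 1$,   $s^2 \sum_{j=1}^{n} a_j^2 = z$,   $a_1 = \cdots = a_{n_0}$. 

Then 

(58)   $F' = \dfrac{\sum_{i=1}^{p} \big( \sum_{j=1}^{n} a_j y_{ij} \big)^2}{z (n_0 t - \mu)}$ 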


has the non-central F-distribution given by (34) with $n = n_0 t - \mu$, $m = p$ and 

(59)   $\sum c_i^2 = \sum_{i=1}^{p} \xi_i^2 \Big/ (n_0 t - \mu) z$, 


where the $\xi_i$ are the true means, allowing for the possibility that $H_0$ is not true. For, 
$(n_0 t - \mu) s^2/\sigma^2$ has the distribution of $\chi^2_{n_0 t - \mu}$, and, after it has been determined, 
$\sum_{j=1}^{n} a_j y_{ij} - \xi_i$, $i = 1, \cdots, p$, are independently normally distributed with mean 0 
and variance $\sigma^2 \sum a_j^2 = \sigma^2 z/s^2$, so that, given $s^2$, 
$\big(\sum_{j=1}^{n} a_j y_{ij} - \xi_i\big)/\sqrt{z}$, $i = 1, \cdots, p$, 
are independently normally distributed with mean 0 and variance $\sigma^2/s^2$. But 
the random variables $t_i$ in section 3 are of the form $x_i/\sqrt{r}$ where the $x_i$ are 
independently normally distributed with mean 0 and variance $\sigma^2$, while $r/\sigma^2$ has the 
$\chi^2_{n_0 t - \mu}$ distribution independent of the $x_i$. Thus $t_i$ can be considered to have 
been obtained by first selecting a stochastic variable $r$ such that $r/\sigma^2$ has the 
distribution of $\chi^2_{n_0 t - \mu}$ and then selecting the $t_i$ to be independently normally 
distributed, given $r$, with mean 0 and variance $\sigma^2/r$. Since $r$ corresponds with 
$(n_0 t - \mu) s^2$, comparing this with the above, we find that 


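(60)   $\dfrac{\sum_{j=1}^{n} a_j y_{ij} - \xi_i}{\sqrt{z}\,\sqrt{n_0 t - \mu}}$,   $i = 1, \cdots, p$, 

have the same joint distribution as the $t_i$. The $\xi_i/\sqrt{(n_0 t - \mu) z}$ are constants, so 
that 

(61)   $F' = \dfrac{\sum_{i=1}^{p} \big( \sum_{j=1}^{n} a_j y_{ij} \big)^2}{z (n_0 t - \mu)} = \sum_{i=1}^{p} \Big\{ \dfrac{\sum_{j=1}^{n} a_j y_{ij} - \xi_i}{\sqrt{z (n_0 t - \mu)}} + \dfrac{\xi_i}{\sqrt{z (n_0 t - \mu)}} \Big\}^2$ 

has the same distribution (34) as $\sum_{1}^{p} (t_i - c_i)^2$, with $c_i = -\xi_i/\sqrt{(n_0 t - \mu) z}$. 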
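The two-stage rule (56) and (57) can be sketched as follows. This is a minimal illustration, not the paper's prescription: it assumes that $[q]$ may be taken as the integral part of $q$, and it adopts one particular "preassigned rule" (a common weight on the first $n_0$ sets and a second common weight on the remaining $n - n_0$ sets). The function names and the numerical inputs are hypothetical.

```python
import math

def second_stage_size(s2, z, n0):
    """n = max([s^2/z] + 1, n0 + 1), taking [q] as the integral part of q (assumption)."""
    return max(math.floor(s2 / z) + 1, n0 + 1)

def stage_weights(s2, z, n0):
    """One admissible choice of a_1..a_n with sum a_j = 1, s^2 sum a_j^2 = z, a_1 = ... = a_n0.

    Assumed rule: weight a on the first n0 sets, weight b on the other n - n0 sets.
    """
    n = second_stage_size(s2, z, n0)
    k = n - n0                                  # number of additional sets, at least 1
    q = z / s2                                  # required value of sum a_j^2
    # a solves n0*n*a^2 - 2*n0*a + (1 - k*q) = 0; real because n > s^2/z gives n*q >= 1.
    root = math.sqrt(max(0.0, k * (n * q - 1.0) / n0))
    a = (1.0 + root) / n
    b = (1.0 - n0 * a) / k
    return [a] * n0 + [b] * k

# Hypothetical inputs: s^2 = 3.0 from a first sample of n0 = 10 sets, z = 0.125.
n = second_stage_size(3.0, 0.125, 10)
weights = stage_weights(3.0, 0.125, 10)
```

With these inputs $n = 25$, the first ten weights are $0.05$ and the last fifteen are $1/30$. Choosing the other root of the quadratic would give a second admissible set of weights; the conditions (57) do not single out either.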

The tests of significance and confidence regions are obtained by a procedure 
completely analogous to that used in the case of Student's hypothesis. If we 
define $k$ by 

(62)   $P\{F_{p,\,n_0 t - \mu} > k\} = \alpha$, 

then a critical region of size $\alpha$ for testing $H_0$ is given by 

(63)   $F' > k$. 

Its power function is 

(64)   $1 - F_{p,\,n_0 t - \mu}\Big(k;\ \dfrac{\sum_{1}^{p} \xi_i^2}{z (n_0 t - \mu)}\Big)$. 

Similarly, a confidence region for $\xi_i$, $i = 1, \cdots, p$, of confidence coefficient $1 - \alpha$ 
is given by the set of all $\xi_i$ such that 

(65)   $F'(\xi_1, \cdots, \xi_p) < k$, 

where 

(66)   $F'(\xi_1, \cdots, \xi_p) = \dfrac{\sum_{i=1}^{p} \big( \sum_{j=1}^{n} a_j y_{ij} - \xi_i \big)^2}{z (n_0 t - \mu)}$. 




It is evident that this defines the interior of the hypersphere 

(67)   $\sum_{i=1}^{p} \Big( \xi_i - \sum_{j=1}^{n} a_j y_{ij} \Big)^2 = k z (n_0 t - \mu)$, 

whose volume is independent of the variance $\sigma^2$. 

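The distribution of $n$, the required number of sets of observations for the 
above tests and confidence intervals, is given by 

(68)   $P\{n = n_0 + 1\} = P\Big\{\dfrac{s^2}{z} < n_0 + 1\Big\} = P\{(n_0 t - \mu) s^2/\sigma^2 < (n_0 + 1)(n_0 t - \mu) z/\sigma^2\}$ 

$= P\{\chi^2_\delta < y\} = \dfrac{1}{(\sqrt{2})^\delta\, \Gamma(\tfrac12 \delta)} \displaystyle\int_0^y u^{\frac12 \delta - 1} e^{-\frac12 u}\, du$, 

where 

(69)   $y = (n_0 + 1)(n_0 t - \mu) z/\sigma^2$,   $\delta = n_0 t - \mu$, 

and 

(70)   $P\{n = \nu\} = P\Big\{\nu - 1 < \dfrac{s^2}{z} < \nu\Big\} = P\{(\nu - 1)\delta z/\sigma^2 < \chi^2_\delta < \nu \delta z/\sigma^2\} = \dfrac{1}{(\sqrt{2})^\delta\, \Gamma(\tfrac12 \delta)} \displaystyle\int_{(\nu-1)\delta z/\sigma^2}^{\nu \delta z/\sigma^2} u^{\frac12 \delta - 1} e^{-\frac12 u}\, du$, 

for integral $\nu > n_0 + 1$, all other values being impossible. 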

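Thus $E(n)$ satisfies the inequalities 

(71)   $\dfrac{1}{(\sqrt{2})^\delta\, \Gamma(\tfrac12 \delta)} \Big\{ (n_0 + 1) \displaystyle\int_0^y u^{\frac12 \delta - 1} e^{-\frac12 u}\, du + \dfrac{\sigma^2}{\delta z} \int_y^\infty u^{\frac12 \delta} e^{-\frac12 u}\, du \Big\} < E(n)$ 

$< \dfrac{1}{(\sqrt{2})^\delta\, \Gamma(\tfrac12 \delta)} \Big\{ (n_0 + 1) \displaystyle\int_0^y u^{\frac12 \delta - 1} e^{-\frac12 u}\, du + \dfrac{\sigma^2}{\delta z} \int_y^\infty u^{\frac12 \delta} e^{-\frac12 u}\, du + \int_y^\infty u^{\frac12 \delta - 1} e^{-\frac12 u}\, du \Big\}$, 

which can be rewritten 

(72)   $(n_0 + 1) P\{\chi^2_\delta < y\} + \dfrac{\sigma^2}{z} P\{\chi^2_{\delta+2} > y\} < E(n) < (n_0 + 1) P\{\chi^2_\delta < y\} + \dfrac{\sigma^2}{z} P\{\chi^2_{\delta+2} > y\} + P\{\chi^2_\delta > y\}$. 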
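The distribution of $n$ and the bounds on $E(n)$ can be evaluated numerically. The sketch below assumes that $\delta = n_0 t - \mu$ is even, so that the $\chi^2$ distribution function has an elementary closed form, and it truncates the sum for $E(n)$ at a hypothetical cut-off `nu_max`. All function names and inputs are illustrative assumptions.

```python
import math

def chi2_cdf_even(y, df):
    """P{chi-square with an even number df of degrees of freedom < y} (closed form)."""
    if df % 2 != 0:
        raise ValueError("this sketch only handles even degrees of freedom")
    half = y / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term                 # sum of (y/2)^i / i!
        term *= half / (i + 1)
    return 1.0 - math.exp(-half) * total

def size_distribution(n0, t, mu, z, sigma2, nu_max=400):
    """P{n = nu} for nu = n0+1, ..., nu_max, following the two-stage rule."""
    delta = n0 * t - mu
    scale = delta * z / sigma2        # chi-square value corresponding to s^2/z = 1
    probs = {n0 + 1: chi2_cdf_even((n0 + 1) * scale, delta)}
    for nu in range(n0 + 2, nu_max + 1):
        probs[nu] = (chi2_cdf_even(nu * scale, delta)
                     - chi2_cdf_even((nu - 1) * scale, delta))
    return probs

def expected_size_bounds(n0, t, mu, z, sigma2):
    """Lower and upper bounds on E(n) in terms of chi-square tail probabilities."""
    delta = n0 * t - mu
    y = (n0 + 1) * delta * z / sigma2
    lower = ((n0 + 1) * chi2_cdf_even(y, delta)
             + (sigma2 / z) * (1.0 - chi2_cdf_even(y, delta + 2)))
    upper = lower + (1.0 - chi2_cdf_even(y, delta))
    return lower, upper

# Hypothetical design: n0 = 5 sets of t = 2 observations, mu = 2, z = 0.5, sigma^2 = 4.
probs = size_distribution(5, 2, 2, 0.5, 4.0)
total_prob = sum(probs.values())
mean_n = sum(nu * p for nu, p in probs.items())
lower, upper = expected_size_bounds(5, 2, 2, 0.5, 4.0)
```

For these inputs the directly summed $E(n)$ falls between the two bounds, which differ by exactly $P\{\chi^2_\delta > y\}$.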





The modifications required to avoid wasting information are exactly analogous 
to those made in the case of the test for Student’s hypothesis. 


5. Non-existence of a single-sample test for a linear hypothesis whose power 
is independent of the variance. The canonical form (see Tang [3]) for a linear 
hypothesis in the single sample case can be derived immediately from (53) and 
(54). Let $x_i$, $i = 1, \cdots, n$ be independently normally distributed with means 

(73)   $E x_i = \xi_i$,   $i = 1, \cdots, p$, 
       $E x_i = 0$,   $i = p + 1, \cdots, n$, 

and variance $\sigma^2$. The $\xi_i$ and $\sigma^2$ are unknown, and we wish to test $H_0$: $\xi_i = 0$, 
$i = 1, \cdots, p$. 

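The most powerful test for $H_0$ against a given alternative $H_1$: $\xi_i = \xi_{i0}$, $i = 1, \cdots, p$, 
if the variance $\sigma^2$ is known, is that based upon the probability ratio (see Neyman 
and Pearson [4]) 

(74)   $\dfrac{p_1}{p_0} = \dfrac{(\sqrt{2\pi}\,\sigma)^{-n} \exp\Big[ -\dfrac{1}{2\sigma^2} \Big\{ \sum_{1}^{p} (x_i - \xi_{i0})^2 + \sum_{p+1}^{n} x_i^2 \Big\} \Big]}{(\sqrt{2\pi}\,\sigma)^{-n} \exp\Big[ -\dfrac{1}{2\sigma^2} \sum_{1}^{n} x_i^2 \Big]}$. 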

Since any strictly increasing function of $p_1/p_0$ is equivalent for this purpose, 
we can use 

(75)   $\varphi(x_1, \cdots, x_p) = \sum_{1}^{p} \xi_{i0} x_i$. 


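The critical region of size $\alpha$ based upon $\varphi$ is given by 

(76)   $W_0(\sigma) = E\Big\{ x : \sum_{1}^{p} \xi_{i0} x_i > \lambda_\alpha\, \sigma \sqrt{\sum_{1}^{p} \xi_{i0}^2} \Big\}$, 

where 

(77)   $\dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{\lambda_\alpha}^{\infty} e^{-\frac12 x^2}\, dx = \alpha$, 

since, under $H_0$, $\sum_{1}^{p} \xi_{i0} x_i$ is normally distributed with mean 0 and variance 
$\sigma^2 \sum_{1}^{p} \xi_{i0}^2$. Under $H_1$, $\sum_{1}^{p} \xi_{i0} x_i$ is normally distributed with mean 
$\sum_{1}^{p} \xi_{i0}^2$ and variance $\sigma^2 \sum_{1}^{p} \xi_{i0}^2$. Thus the power of the test for the 
alternative $H_1$ as a function of $\sigma$ is 

(78)   $1 - \beta_0(\sigma) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{\lambda_\alpha - \sqrt{\sum \xi_{i0}^2}/\sigma}^{\infty} e^{-\frac12 x^2}\, dx$. 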

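Now let us suppose there exists a test based on the critical region $W$ of size $\alpha$ 
whose power $1 - \beta$ is independent of $\sigma$. Since $W_0(\sigma)$ is the best critical region 
of size $\alpha$ for any $\sigma$ we must have 

(79)   $1 - \beta \le 1 - \beta_0(\sigma)$, 

so that 

(80)   $1 - \beta \le \underset{\sigma}{\mathrm{g.l.b.}}\ [1 - \beta_0(\sigma)] = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{\lambda_\alpha}^{\infty} e^{-\frac12 x^2}\, dx = \alpha$. 

By interchanging $H_0$ and $H_1$ we can reverse the inequality (80), proving 

(81)   $1 - \beta = \alpha$. 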

Thus any single-sample test for a linear hypothesis whose power is independent 
of the variance has constant power equal to the size of the critical region. 
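The behaviour of the best known-variance test as $\sigma$ grows can be illustrated numerically. This is a sketch under the assumption that the power of that test equals the upper tail of the unit normal distribution beyond the size-$\alpha$ critical value lowered by $\sqrt{\sum \xi_{i0}^2}/\sigma$; the function name and the alternative chosen are hypothetical.

```python
import math
from statistics import NormalDist

def best_test_power(xi0, sigma, alpha):
    """Power of the most powerful size-alpha test of xi = 0 against xi = xi0, sigma known."""
    std_normal = NormalDist()                       # mean 0, standard deviation 1
    critical = std_normal.inv_cdf(1.0 - alpha)      # upper alpha point of the unit normal
    shift = math.sqrt(sum(x * x for x in xi0)) / sigma
    return 1.0 - std_normal.cdf(critical - shift)

# Hypothetical alternative xi_10 = 1, xi_20 = 2 and size alpha = 0.05.
alpha = 0.05
power_small_sigma = best_test_power([1.0, 2.0], 1.0, alpha)
power_medium_sigma = best_test_power([1.0, 2.0], 10.0, alpha)
power_large_sigma = best_test_power([1.0, 2.0], 1.0e6, alpha)
```

The power decreases steadily towards $\alpha$ as $\sigma$ increases, so its greatest lower bound over $\sigma$ is $\alpha$; a test whose power does not depend on $\sigma$ therefore cannot have power above $\alpha$.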

REFERENCES 

[1] George B. Dantzig, "On the non-existence of tests of 'Student's' hypothesis having 
    power functions independent of σ," Annals of Math. Stat., Vol. 11 (1940), p. 186. 

[2] S. S. Wilks, Mathematical Statistics, Princeton, 1943. 

[3] P. C. Tang, "The power function of the analysis of variance tests," Stat. Res. Mem., 
    Vol. 2 (1938). 

[4] Neyman and Pearson, Stat. Res. Mem., Vol. 1 (1936). 

[5] A. Wald, "Sequential tests of statistical hypotheses," Annals of Math. Stat., Vol. 16, 
    June 1945. 



COMPACT COMPUTATION OF THE INVERSE OF A MATRIX 

By Frederick V. Waugh and Paul S. Dwyer 
War Food Administration and The University of Michigan 

1. Introduction. Among the most common applications of mathematics to 
practical problems are the solution of simultaneous equations, the evaluation of 
determinants, and the computation of the complete inverse, (or the complete 
adjugate), of a given matrix. Even with modern computing machines these are 
laborious, time-consuming jobs. For that reason there has been great interest 
in recent years in the development of so-called “compact” methods; that is, 
methods that eliminate all unnecessary detail, that use computing machines 
to do as much of the work as possible, and that only require copying the results 
needed in further analysis. 

In 1935 a paper by one of the authors [1] and since then papers by the other 
author [2], [3], [4], [5], [6] and [7] have outlined a variety of compact methods 
and have applied them to actual problems. These papers, together with other 
recent contributions, such as those presented in [8], [9] and [10], have resulted 
in much improved and more compact techniques in the general field of the 
solution of linear simultaneous equations and allied topics, especially if the matrix 
is axi-symmetric. It is not generally recognized, however, that extension of 
these procedures (usually involving matrix factorization [7] [10]) can be used 
to compute the inverse (and adjugate) directly from the matrix factors without 
the necessity of the reduction of the unit matrix [11; 150] [2; 121] when the 
matrix is non-symmetric. 

The present paper extends the use of compact methods in three ways. 

(a) It presents a method of computing the inverse (and adjugate) of a symmetric 
or non-symmetric matrix by compact Gaussian methods without the 
formal reduction of an auxiliary identity matrix. 

(b) It introduces the method of multiplication and subtraction with division— 
a modification of the method of multiplication and subtraction—and shows that 
the terms recorded in the compact solution are themselves determinants which 
are minors of the determinant of the matrix. 

(c) It uses the method of multiplication and subtraction with division as a 
compact means of computing the exact value of any minor of the determinant 
of the matrix (whether symmetric or non-symmetric). It further shows how all 
cofactors of order n — 1 (constituting the adjugate) can be computed from a 
compact presentation of the calculations of the determinant of the matrix. 

2. Gaussian methods and notation. Probably the method most generally 
used to solve simultaneous equations is the division method originated by Gauss 
[12]. Variations of this method are known as the Doolittle Method [13], the 
method of pivotal condensation [14], the method of single division [2; 104-112], 
and the Crout method [8]. The methods as outlined by Gauss and Doolittle 
are applicable only to axi-symmetric matrices (common to least squares theory) 
while a more general presentation, applicable to non-symmetric matrices as well, 
has been made by more recent authors. 

The compact form of this method, extended to apply to the non-symmetric 
matrix, used in this paper is as follows: 

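Given the matrix 

(1)
$$a = (a_{rk}) = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & & & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{pmatrix}$$

we compute 

(2)
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ b_{21} & a_{22.1} & a_{23.1} & \cdots & a_{2n.1} \\ b_{31} & b_{32.1} & a_{33.12} & \cdots & a_{3n.12} \\ \vdots & & & & \vdots \\ b_{n1} & b_{n2.1} & b_{n3.12} & \cdots & a_{nn.12\cdots(n-1)} \end{pmatrix}$$

where 

(3)
$$\begin{aligned}
b_{r1} &= a_{r1}/a_{11} \\
a_{rk.1} &= a_{rk} - b_{r1} a_{1k} \\
b_{r2.1} &= (a_{r2} - b_{r1} a_{12})/a_{22.1} \\
a_{rk.12} &= a_{rk} - b_{r1} a_{1k} - b_{r2.1} a_{2k.1} \\
b_{r3.12} &= (a_{r3} - b_{r1} a_{13} - b_{r2.1} a_{23.1})/a_{33.12}
\end{aligned}$$

and in general 

(4)
$$a_{rk.12\cdots j} = a_{rk.12\cdots(j-1)} - \frac{a_{jk.12\cdots(j-1)}\, a_{rj.12\cdots(j-1)}}{a_{jj.12\cdots(j-1)}}, \qquad b_{rk.12\cdots j} = \frac{a_{rk.12\cdots j}}{a_{kk.12\cdots j}}.$$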

It should be noted that Crout's presentation [8] is similar to that used here 
except that Crout divides the elements of each row by the leading element while 
we divide the elements of columns. 

The notation used above, introduced by one of the authors [2], parallels that 
used extensively in multiple correlation and regression theory. It differs 
somewhat from the notation used by Gauss. See [12; 69]. 

Since every $b$ is the ratio of two $a$'s it follows that every $b$ can be written in 
terms of $a$'s so that the formulas can be written in terms of $a$'s alone. This is 
what Gauss did although he used [ ]'s instead of $a$'s. Gauss also used letters to 
indicate the primary subscripts and a single secondary subscript to indicate 
the number of eliminations. Thus our $a_{22.1}$ was written by Gauss as [bb, 1] and 
$a_{33.12}$ appeared as [cc, 2]. 

It is in the interest of less extensive notation and it makes our notation 
somewhat closer to that introduced by Gauss if we replace 

$a_{rk.12\cdots j}$ by $a_{rk.(j)}$, 

$b_{rk.12\cdots j}$ by $b_{rk.(j)}$. 

This shortened notation can always be used when the secondary subscripts 
include all the integers from 1 to $j$. In this modified notation the formulas (4) 
become 

(5)
$$a_{rk.(j)} = a_{rk.(j-1)} - \frac{a_{jk.(j-1)}\, a_{rj.(j-1)}}{a_{jj.(j-1)}}, \qquad b_{rk.(j)} = \frac{a_{rk.(j)}}{a_{kk.(j)}}.$$

3. Solution by matrix factorization. The values of matrix (2) are in general 
not final answers to proposed problems but they are values from which final 
answers can be computed. The matrix (2) exhibits essentially both the triangular 
matrix of the $a_{rk.(j)}$, which we call $t$, and the triangular matrix of the $b_{rk.(j)}$, which 
we call $s$. (The diagonal entries of the $s$ matrix are all unity and do not appear.) 
Hence (2) is really $s - I + t$. 

A basic property, useful in most problems involving the use of (2), is that $s$ 
and $t$ are factors of $a$. Thus 

(6)   $a = st$ and $a - st = 0$. 

That this is true in the symmetric case was proved in an earlier paper [7; 85]. 
That this is also true for the non-symmetric case is now shown in a similar 
manner. 

Let $t_1$ be a matrix ($n$ by $n$) with the first row composed of elements $a_{1k}$ and 
all other elements 0. Let $s_1$ be a similar matrix with first column elements 
$b_{r1} = a_{r1}/a_{11}$ and all other elements 0. Then $a - s_1 t_1 = a_1 = (a_{rk.1})$ is a matrix 
($n$ by $n$) with all elements of the first column and first row 0. 

Next let $t_2$ be a matrix ($n$ by $n$) with the second row elements $a_{2k.1}$ and all 
other elements 0. Let $s_2$ be a matrix ($n$ by $n$) with second column elements 
$b_{r2.1}$ and all other elements 0. Then $a_1 - s_2 t_2 = a_2 = (a_{rk.12})$ is a matrix ($n$ by $n$) 
with each element of the first two columns and first two rows equal to 0. 

This process is continued through $n$ successive steps, an additional row and 
column being made identically zero at each step. We have then 

(7)   $a - s_1 t_1 - s_2 t_2 - \cdots - s_n t_n = a_n = 0$. 

Now consider the triangular matrix 

$t = t_1 + t_2 + t_3 + \cdots + t_n$ 

with its rows composed of the non-zero rows of the $t_j$. Consider also the triangular 
matrix $s = s_1 + s_2 + \cdots + s_n$. Then $st = s_1 t_1 + s_2 t_2 + \cdots + s_n t_n$ since 
$s_i t_j = 0$ for $i \ne j$; and (7) becomes 

$a - st = 0$ or $a = st$. 
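The factorization can be sketched in a few lines. The following is an illustration only: it assumes every leading element is non-zero (no interchange of rows is attempted), and the trial matrix is the 4 by 4 matrix of Table I as far as it can be read from the scan, so its entries should be treated as an assumption. The function names are hypothetical.

```python
def compact_factor(a):
    """Return (s, t): s unit lower triangular (the b's), t upper triangular (the a's), a = s t."""
    n = len(a)
    work = [[float(x) for x in row] for row in a]        # holds a_rk.(j) as elimination proceeds
    s = [[1.0 if r == k else 0.0 for k in range(n)] for r in range(n)]
    t = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = work[j][j]                               # leading element a_jj.(j-1)
        if pivot == 0.0:
            raise ZeroDivisionError("zero leading element; this sketch does not interchange rows")
        t[j][j:] = work[j][j:]                           # row j of t
        for r in range(j + 1, n):
            s[r][j] = work[r][j] / pivot                 # b_rj.(j-1): column divided by its leading element
            for k in range(j, n):
                work[r][k] -= s[r][j] * work[j][k]
    return s, t

def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[r][j] * y[j][k] for j in range(len(y))) for k in range(len(y[0]))]
            for r in range(len(x))]

# Trial matrix (entries as read from Table I; an assumption).
A = [[26, -10, 15, 32], [19, 45, -14, -8], [-12, 16, 27, 13], [32, 29, -35, 28]]
s, t = compact_factor(A)
product = matmul(s, t)
determinant = 1.0
for j in range(4):
    determinant *= t[j][j]                               # product of the leading elements
```

The product `s t` reproduces the trial matrix, and the product of the diagonal terms of `t` gives the determinant.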

4. Gaussian computation of inverse (and adjugate) without formal reduction 
of auxiliary identity matrix. The inverse of $a$, $a^{-1} = C = (c_{rk})$, can be calculated 
directly from the matrices $s$ and $t$ of (2). The adjugate $D = (d_{rk})$ can be 
calculated by multiplication by the determinant of the matrix and this can be 
calculated by the well known formula 

(8)   $\Delta = a_{11}\, a_{22.1}\, a_{33.(2)} \cdots a_{nn.(n-1)}$. 

The theory is presented in some detail and illustrated for the case $n = 4$ after 
which a more general matrix presentation is given. The matrix equation 
$aC = I$ is equivalent to the following $4^2$ simultaneous equations in the $4^2$ 
unknowns $(c_{rk})$: 

(9)
$$\begin{array}{rcrrrr}
 & & k=1 & k=2 & k=3 & k=4 \\
a_{11} c_{1k} + a_{12} c_{2k} + a_{13} c_{3k} + a_{14} c_{4k} &=& 1 & 0 & 0 & 0 \\
a_{21} c_{1k} + a_{22} c_{2k} + a_{23} c_{3k} + a_{24} c_{4k} &=& 0 & 1 & 0 & 0 \\
a_{31} c_{1k} + a_{32} c_{2k} + a_{33} c_{3k} + a_{34} c_{4k} &=& 0 & 0 & 1 & 0 \\
a_{41} c_{1k} + a_{42} c_{2k} + a_{43} c_{3k} + a_{44} c_{4k} &=& 0 & 0 & 0 & 1
\end{array}$$


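Now since $Ca = I$ also we have $a'C' = I$ and there results another set of $4^2$ 
equations in the $4^2$ unknowns $(c_{rk})$. 

(10)
$$\begin{array}{rcrrrr}
 & & r=1 & r=2 & r=3 & r=4 \\
a_{11} c_{r1} + a_{21} c_{r2} + a_{31} c_{r3} + a_{41} c_{r4} &=& 1 & 0 & 0 & 0 \\
a_{12} c_{r1} + a_{22} c_{r2} + a_{32} c_{r3} + a_{42} c_{r4} &=& 0 & 1 & 0 & 0 \\
a_{13} c_{r1} + a_{23} c_{r2} + a_{33} c_{r3} + a_{43} c_{r4} &=& 0 & 0 & 1 & 0 \\
a_{14} c_{r1} + a_{24} c_{r2} + a_{34} c_{r3} + a_{44} c_{r4} &=& 0 & 0 & 0 & 1
\end{array}$$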

Fisher [11; 150] has shown that the equations (9) could be solved by reducing the 
unit matrix on the right. One of the authors has shown how to calculate the 
inverse of a symmetric matrix by Gaussian methods without reducing the unit 
matrix [1]. We now show how to reduce the non-symmetric matrix similarly. 
By the same process used in getting from matrix (1) to matrix (2), we can reduce 
the $4^2$ equations of (9) to the $4^2$ auxiliary equations below. 

(11)
$$\begin{array}{rcrrrr}
 & & k=1 & k=2 & k=3 & k=4 \\
a_{11} c_{1k} + a_{12} c_{2k} + a_{13} c_{3k} + a_{14} c_{4k} &=& 1 & 0 & 0 & 0 \\
a_{22.1} c_{2k} + a_{23.1} c_{3k} + a_{24.1} c_{4k} &=& * & 1 & 0 & 0 \\
a_{33.(2)} c_{3k} + a_{34.(2)} c_{4k} &=& * & * & 1 & 0 \\
a_{44.(3)} c_{4k} &=& * & * & * & 1
\end{array}$$

The terms marked * can be computed by the process. However if we do not 
compute these terms we have ten equations with the right hand terms either 
1 or 0. 

In a similar way the $4^2$ equations of (10) can be reduced to the $4^2$ auxiliary 
equations below. As above we may neglect the calculation of the diagonal 
terms, and of all terms below the diagonal, and still have six equations (with 
terms on the right zero). 

(12)
$$\begin{array}{rcrrrr}
 & & r=1 & r=2 & r=3 & r=4 \\
c_{r1} + b_{21} c_{r2} + b_{31} c_{r3} + b_{41} c_{r4} &=& * & 0 & 0 & 0 \\
c_{r2} + b_{32.1} c_{r3} + b_{42.1} c_{r4} &=& * & * & 0 & 0 \\
c_{r3} + b_{43.(2)} c_{r4} &=& * & * & * & 0 \\
c_{r4} &=& * & * & * & *
\end{array}$$


The ten equations of (11) with the six equations of (12) are sufficient for 
determining the inverse matrix. Solve (11) for $k = 4$; then solve (12) for $r = 4$; 
then solve (11) for $k = 3$; then solve (12) for $r = 3$; etc. Each equation can be 
solved completely on the machine to give a value of a $c_{rk}$. 

It should be noted that Gaussian methods are approximation methods since 
they are division methods. For a discussion and treatment of the errors 
resulting the reader is referred to papers by Hotelling [9] and Satterthwaite [10] 
to which further reference is made in the next section. 
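The alternating order of solution (a column of the inverse from (11), then a row from (12), working from the last index back to the first) can be sketched as below. It is an illustration under stated assumptions: leading elements are taken to be non-zero, the helper names are hypothetical, and the trial matrix is the one of Table I as far as it can be read from the scan.

```python
def compact_factor(a):
    """Unit lower triangular s and upper triangular t with a = s t (no row interchange)."""
    n = len(a)
    work = [[float(x) for x in row] for row in a]
    s = [[1.0 if r == k else 0.0 for k in range(n)] for r in range(n)]
    t = [[0.0] * n for _ in range(n)]
    for j in range(n):
        t[j][j:] = work[j][j:]
        for r in range(j + 1, n):
            s[r][j] = work[r][j] / work[j][j]
            for k in range(j, n):
                work[r][k] -= s[r][j] * work[j][k]
    return s, t

def inverse_from_factors(s, t):
    """Inverse C of a = s t, using only the known (1 or 0) right hand terms."""
    n = len(t)
    c = [[0.0] * n for _ in range(n)]
    for k in range(n - 1, -1, -1):
        # Equations of type (11), column k: sum_j t[r][j] c[j][k] = 1 if r == k else 0, for r <= k.
        for r in range(k, -1, -1):
            rhs = 1.0 if r == k else 0.0
            acc = sum(t[r][j] * c[j][k] for j in range(r + 1, n))
            c[r][k] = (rhs - acc) / t[r][r]
        # Equations of type (12), row k: c[k][m] + sum_{j>m} c[k][j] s[j][m] = 0, for m < k.
        for m in range(k - 1, -1, -1):
            c[k][m] = -sum(c[k][j] * s[j][m] for j in range(m + 1, n))
    return c

# Trial matrix (entries as read from Table I; an assumption).
A = [[26, -10, 15, 32], [19, 45, -14, -8], [-12, 16, 27, 13], [32, 29, -35, 28]]
s, t = compact_factor(A)
C = inverse_from_factors(s, t)
check = [[sum(A[r][j] * C[j][k] for j in range(4)) for k in range(4)] for r in range(4)]
```

Each entry of `C` is obtained from a single equation whose other unknowns have already been found, so neither the inverse of `s` nor the inverse of `t` is formed; `check` is the final multiplication of the matrix by its computed inverse.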

Different forms for presentation of the results may be used. We suggest 
the following form which presents first the matrix (1), then the terms of the 
matrix (2). The terms of the matrix $C'$ are then computed by (11) and (12) 
and placed diagonally adjacent to the terms of (2). The transpose of $C$ is used 
so that the check multiplication by $a$ may be most easily accomplished. The 
result of this multiplication which next appears shows that the computed value 
of $a^{-1}$ is correct to three places. The final matrix of Table I gives the value of 
the adjugate, $D$, as found by multiplying each element of the inverse 
by (26)(52.308)(39.356)(43.071) = 2,305,300 (to five places). 

It is possible to check the accuracy of the entries of each row and column 
of the matrix (2) separately by using a check sum to the right of each row and 
at the bottom of each column. We have not taken the space to show check 
sums and they are not particularly needed after one gets a little practice with 
the method. In any case $aC$ should be computed as a final check. 

A more general matrix presentation results from the use of (6). The matrix 
equation $aC = I$ becomes $stC = I$ and hence the auxiliary equation becomes 

(13)   $tC = s^{-1}$. 

Now since $s$ is triangular with unit diagonal terms and zeros above the 
diagonal, it follows that $s^{-1}$ also has unit diagonal terms with zeros above the 
diagonal. Hence we can select $n(n+1)/2$ equations from the $n^2$ equations of (13) 
which demand no further knowledge of the entries of $s^{-1}$. A similar treatment 
of the matrix equation $a'C' = I$, $t's'C' = I$ and 

(14)   $s'C' = (t')^{-1}$ 

yields $n(n-1)/2$ equations involving zero terms of $(t')^{-1}$. These two sets of 
equations taken together in the proper order are sufficient for calculating the 
$n^2$ values in the inverse. 

It may be of interest to note that this is also a procedure for calculating 
$t^{-1}s^{-1}$ when $t$ and $s$ are known without the calculation of $t^{-1}$ and $s^{-1}$ separately 
since 

(15)   $C = a^{-1} = t^{-1}s^{-1}$. 


5. The method of multiplication and subtraction with division. We now 
present a different method, based upon the work of Hermite [15] and Chiò [16] 

TABLE I 

Suggested form for calculation 

       26        -10         15         32
       19         45        -14         -8
      -12         16         27         13
       32         29        -35         28

       26        -10         15         32        .02873    -.00696     .01825    -.00283
   .73077     52.308    -24.962    -31.385        .02436     .01239     .01440    -.02267
  -.46154     .21765     39.356     34.600       -.02302     .01572     .00791     .01991
  1.23077     .78970    -.85753     43.071       -.01519     .00419    -.02041     .02322

    1.000      0.000      0.000      0.000
    0.000      1.000      0.000      0.000
    0.000      0.000      1.000      0.000
    0.000      0.000      0.000      1.000

    66231     -16045      42072      -6524
    56157      28563      33196     -52261
   -53068      36239      18235      45899
   -35018       9659     -47051      53529


together with important modifications suggested by the work of Dodgson [17]. 
Current presentations of the basic method include the "method of condensation" 
[18; 45-48] and in compact forms, the "method of multiplication and 
subtraction" of one of the authors [2; 197-202]. 

In Gaussian methods we divide each element of a column by the leading 
(diagonal) element of that column. In the method of multiplication and 
subtraction we use the leading element as a "pivot" forming a number of 
two-rowed determinants. Thus we use the leading elements as multipliers rather 
than as divisors. No divisions are made in this method. This is a very real 
advantage when the elements of the original matrix contain only two (or three) 
digits each and when $n < 7$ (or 5). In such cases we can use this method to 
compute exactly the values of any minor of the determinant of the matrix and 
even the adjugate itself. 

It is perhaps well to mention here that error control is difficult with division 
(Gaussian) methods. Even if many significant places are carried the errors 
may be significant, cumulative, and difficult to measure. The techniques 
suggested by the papers of Hotelling [9] and Satterthwaite [10] are most useful in 
developing error control in matrix calculation. However, where accuracy is 
important, and when the number of digits is not excessive, there appears to be 
merit in calculating the exact values. 

In the method of multiplication and subtraction, we compute from the matrix 
(1) the following matrix 


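(16)
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & A_{22.1} & A_{23.1} & \cdots & A_{2n.1} \\ a_{31} & A_{32.1} & A_{33.(2)} & \cdots & A_{3n.(2)} \\ \vdots & & & & \vdots \\ a_{n1} & A_{n2.1} & A_{n3.(2)} & \cdots & A_{nn.(n-1)} \end{pmatrix}$$

where 

(17)
$$\begin{aligned}
A_{rk.1} &= a_{11} a_{rk} - a_{1k} a_{r1} \\
A_{rk.(2)} &= A_{22.1} A_{rk.1} - A_{2k.1} A_{r2.1}
\end{aligned}$$

and in general 

$$A_{rk.(j)} = A_{jj.(j-1)} A_{rk.(j-1)} - A_{jk.(j-1)} A_{rj.(j-1)}.$$

This notation is similar to that used in connection with Gaussian methods above. 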

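In the method of multiplication and subtraction with division, we compute 
from the matrix (1) the following matrix: 

(18)
$$\begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & B_{22.1} & B_{23.1} & \cdots & B_{2n.1} \\ a_{31} & B_{32.1} & B_{33.(2)} & \cdots & B_{3n.(2)} \\ \vdots & & & & \vdots \\ a_{n1} & B_{n2.1} & B_{n3.(2)} & \cdots & B_{nn.(n-1)} \end{pmatrix}$$

where 

(19)
$$\begin{aligned}
B_{rk.1} &= a_{11} a_{rk} - a_{1k} a_{r1} \\
B_{rk.(2)} &= \frac{B_{22.1} B_{rk.1} - B_{2k.1} B_{r2.1}}{a_{11}} \\
B_{rk.(3)} &= \frac{B_{33.(2)} B_{rk.(2)} - B_{3k.(2)} B_{r3.(2)}}{B_{22.1}}
\end{aligned}$$

and in general 

(20)
$$B_{rk.(j)} = \frac{B_{jj.(j-1)} B_{rk.(j-1)} - B_{jk.(j-1)} B_{rj.(j-1)}}{B_{j-1,j-1.(j-2)}}$$

with $B_{rk.1}$ and $B_{rk.(2)}$ as defined in (19). 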






In general the method calls for the calculation of entries according to the 
method of multiplication and subtraction but in addition calls for the division 
by the leading element of the second preceding row or column. Since this 
division must be exact, as is shown in the next section, we have at each stage 
a good numerical check on the work as well as an exact value of the entry. 
Furthermore it is shown in the next section that the value of $B_{rk.(j)}$ is the exact value 
of the determinant 

(21)
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1j} & a_{1k} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2j} & a_{2k} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3j} & a_{3k} \\ \vdots & & & & & \vdots \\ a_{j1} & a_{j2} & a_{j3} & \cdots & a_{jj} & a_{jk} \\ a_{r1} & a_{r2} & a_{r3} & \cdots & a_{rj} & a_{rk} \end{vmatrix}$$


All the recorded entries (themselves values of determinants) are calculated on 
the machine. The only limitation is the number of places the machine provides. 
For the trivial problems (composed of small integers) found in most texts of 
College Algebra, one can calculate the values readily without machines. For 
example the determinant 

$$\Delta = \begin{vmatrix} 2 & 1 & -3 & 4 \\ 3 & 2 & 2 & 1 \\ -2 & -1 & 1 & 3 \\ 4 & -3 & 2 & 1 \end{vmatrix} \quad \text{yields at once} \quad \begin{pmatrix} 2 & 1 & -3 & 4 \\ 3 & 1 & 13 & -10 \\ -2 & 0 & -2 & 7 \\ 4 & -10 & 73 & -397 \end{pmatrix}$$

and the value of $\Delta$ is $-397$. All the other entries are also minors of $\Delta$. 

Dodgson introduced a method of multiplication and subtraction with division 
as early as 1866 [17]. He however used a moving pivot. For our purposes it 
seems preferable to use a fixed pivot as we suggest in this paper. 
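The method of multiplication and subtraction with division, with a fixed pivot, can be sketched in integer arithmetic as follows. The sketch assumes integer entries and non-zero leading elements; the function name is hypothetical. Each division is verified to be exact, which is the numerical check described above.

```python
def compact_exact(a):
    """Compact matrix of the B's for an integer matrix a, using exact integer division."""
    n = len(a)
    b = [row[:] for row in a]
    previous_pivot = 1                      # divisor at the first stage (no division needed)
    for j in range(n - 1):
        pivot = b[j][j]                     # leading element B_jj.(j-1)
        for r in range(j + 1, n):
            for k in range(j + 1, n):
                numerator = pivot * b[r][k] - b[j][k] * b[r][j]
                quotient, remainder = divmod(numerator, previous_pivot)
                if remainder != 0:
                    raise ArithmeticError("division not exact: an error has been made")
                b[r][k] = quotient
        previous_pivot = pivot              # leading element of the second preceding stage next time
    return b

example = [[2, 1, -3, 4], [3, 2, 2, 1], [-2, -1, 1, 3], [4, -3, 2, 1]]
recorded = compact_exact(example)
determinant = recorded[-1][-1]
```

Row $j$ and column $j$ are left untouched once stage $j$ has been reached, so the working array ends as the compact matrix itself, with the determinant in the last diagonal position.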


6. Proofs of theorems involving the $B_{rk.(j)}$. 

(a) First theorem. We first prove that the numerator 
$B_{jj.(j-1)} B_{rk.(j-1)} - B_{jk.(j-1)} B_{rj.(j-1)}$ in the definition of $B_{rk.(j)}$ is exactly divisible by the 
denominator $B_{j-1,j-1.(j-2)}$. To do this we expand the terms of this numerator of (20) with 
the continued use of 

(22)
$$B_{rk.(j-1)} = \frac{B_{j-1,j-1.(j-2)} B_{rk.(j-2)} - B_{j-1,k.(j-2)} B_{r,j-1.(j-2)}}{B_{j-2,j-2.(j-3)}}$$

(which is (20) with $j$ replaced by $j - 1$) and then we multiply and cancel. It 
is found that $B_{j-1,j-1.(j-2)}$ is a factor of all non-cancellable terms so the exact 
divisibility is proved. 

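(b) Second theorem. We next prove that $B_{rk.(j)}$ is the value of the determinant 
(21). We illustrate first for $j = 3$ and then give a more general proof. When 
$j = 3$ 

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{1k} \\ a_{21} & a_{22} & a_{23} & a_{2k} \\ a_{31} & a_{32} & a_{33} & a_{3k} \\ a_{r1} & a_{r2} & a_{r3} & a_{rk} \end{vmatrix} = \frac{1}{a_{11}^3} \begin{vmatrix} a_{11} & a_{12} & a_{13} & a_{1k} \\ 0 & B_{22.1} & B_{23.1} & B_{2k.1} \\ 0 & B_{32.1} & B_{33.1} & B_{3k.1} \\ 0 & B_{r2.1} & B_{r3.1} & B_{rk.1} \end{vmatrix} = \frac{1}{a_{11}^2} \begin{vmatrix} B_{22.1} & B_{23.1} & B_{2k.1} \\ B_{32.1} & B_{33.1} & B_{3k.1} \\ B_{r2.1} & B_{r3.1} & B_{rk.1} \end{vmatrix} = \frac{1}{B_{22.1}} \begin{vmatrix} B_{33.(2)} & B_{3k.(2)} \\ B_{r3.(2)} & B_{rk.(2)} \end{vmatrix} = B_{rk.(3)}.$$

In the more general case we designate the determinant (21) by $|a_{rk}|$ and reduce 
the order by the "condensation" method just illustrated. It is understood 
that the values of $B_{rk.(j)}$ used in the following proof have primary subscripts 
larger than secondary subscripts since the rank of the resulting determinant 
decreases with each condensation. 

(23)
$$|a_{rk}| = \frac{1}{a_{11}^{\,j-1}} \big|B_{rk.1}\big| = \frac{1}{B_{22.1}^{\,j-2}} \big|B_{rk.(2)}\big| = \frac{1}{B_{33.(2)}^{\,j-3}} \big|B_{rk.(3)}\big| = \cdots = \frac{1}{B_{j-1,j-1.(j-2)}} \big|B_{rk.(j-1)}\big| = B_{rk.(j)}.$$

It is to be noted that the first theorem, since each $B_{rk.(j)}$ can be interpreted as a 
determinant by the second theorem, is a corollary of a well known theorem 
[19; 33]. In a conventional determinantal notation it might appear as 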

(24) AAyj; rj ~ ArtA,/ 

where the first subscripts indicate deleted rows and the second subscripts deleted 
columns. 

(c) Third theorem. We next relate the values of Brk.^j) and the values Ort 
and brk. 01 • With the use of the second theorem (23) and (8) we have 


(25) 


Brk O) _ fflll fl22 iflaa (2) • • • fljj'd-l) ^rk Ql 
Ork'd) 


= B 


33 (3-1) 


and with the additional use of (4) 


(26) 


Brk (J) _ 011 022 1(133 (2) ‘ ' UjJ (j-1) drk (j) 

brl'd) Ork (,) 


— Bkk O) . 


These formulas may be written in the form 


(27) 


Brk (3) Bjj (^1)0,1;. (j) 
Brk O) ~ Bkl-Oll^rk-if) 



268 


FItKDEltICK Y W\rail AND TAUD H, DAVYKU 


and since Bjj (j_i) and -^^ 1 , 311 .( 7 ; nrc diagonal tenna, it follows that the matrix 
(18) can bp obtained from the matrix (2) by multiplication by diagonal matrices 
(d) Fourth Theorem A fourth theorem gives explicit matrix formulation 
to these results and shows how the values of the matrix (18) can be used m 

factoring the matrix (1) Now (27) and (28) can be written in the form 

(29) Z = aUrt 

(30) '5 = 

where 2 )lr is the diagonal matrix which multiplies t to get I and 911s is the 
diagonal matrix which multiplies 8 to get 0 I'he values of the E matrix are 
the values of (18) with r ^ k while the values of the 0 matrix are the iniliies of 
(18) with r ^ k The diagonal matrix SKt is composed of diagonal elements 
[ 1 , flu , B 22 ] ‘ („- 2 )] while the matrix 9W, is composed oi diagonal 

elements [flu j 1 , B^^ • • /?„„ The basic matrix factorization equa¬ 

tion ( 6 ) tihen appeals as 

(31) a = 

It is to be noted that exact values of elements of all these matrices are available
if the inverse diagonal matrices are written in fractional form, subject, of
course, to practical limitations such as the number of places of the computing machine,
etc.


7. Computation of the adjugate matrix. We now present matrix formulas
which enable one to compute the adjugate of a compactly with the method of
multiplication and subtraction with division. If | a | is the determinant of a
and D is the adjugate of a, we have

nT) = I a 1 3 
StE = 1 a 13 
tS = I a I 0“^ 

9)^tE = im,! a 1 0“' 

(32) EE = an, I 010“' 

and similarly 

a'E' = I a 1 a 
t'0'E' = I a 13 
0'E' = |aj(t')‘^ 

an:0'E' = an. 101 (t')" 

( 33 ) 0 'E'= an, |a| (t')"'. 

The computational procedure in getting the adjugate is very similar to that 
used in getting the inverse in section 4. E and 0 aie triangular matrices while 





and t ai'C the matrices used before Tiie values of Uu , , ‘ ■ 

(n- 2 )] SD^.[au, B 22 - 1 , Bai.( 2 ), ■ ’^nnin-i)] aud lo| are first computed 
by (18) so that 9)?( [ a [ and 3)?, | a | can be calculated. Without further calcula¬ 


tion we aie able to select 


n(n -b 1) 


equations from the matrix equation (32) 


having known coefficients on the right ^vhich are zero^ and 


equations from the matrix equation (33) having zero coefficients on the right.
These constitute the equations necessary to determine the values of $d_{rk}$.
These values of $d_{rk}$ can all be calculated directly on the machine and, what is
more useful in discovering calculational errors, the divisions yielding the $d_{rk}$
must be exact.

For $n = 4$ these $n^2$ equations are


Uu da + U 12 
(34) ’ 


dsn + flia dan -b Uu = 

du -b Baa-i dill -b Bu t d^h — 
18.13.(2) dsk -b BsA.(2)dii, — 
Bu 2 d« = 


[ol 0 0 0 

* an I n I 0 0 

* * B 22 .i\a\ 0 

* * * i?33.(2)|ol 

r=lr='2 r = 3 r=4 


On dfi + Uai 

(35) ' 


dri + Osi dr3 + Clii dri — 

dri -b Ba'i 1 dr3 "b Bti.i drt — 

Baa (2) dr3 + j543.(2) Art = 


* 0 

4! * 

♦ * 


0 

0 


0 

0 

0 


The process is similar to that of section 4. An illustration for the case n = 4
is given in Table II. The matrix of the B's is directly below the matrix a and
the calculated values of the elements of D', the transposed adjugate (obtained by solving (34) and (35)),
are placed diagonally in the cells with the B's. The values of the transpose of
D are used so that the check, premultiplication by a, is easily carried out. The
next matrix in Table II exhibits aD = | a | I. The last matrix of Table II
is a five decimal place approximation to the transposed inverse, which is obtained by dividing the
entries of D' by | a |. Since we know these are the correct five decimal place
values of the inverse, we may compare the corresponding values of Table I to see how
much those are in error. It should be noticed that the approximation to the inverse may
be readily carried to more than five decimal places if desired.

As with the Gaussian methods, it is possible here, also, to check each row
and column individually by using check sums.

The work necessary for the computation of the adjugate from the matrix of
the B's can be shortened somewhat by the use of the fact that the adjugate is
composed of the cofactors of the $a_{rk}$. Now the cofactors of the four terms in
the lower right hand corner are $d_{n-1,n-1} = B_{nn\cdot(n-2)}$; $d_{n-1,n} = -B_{n-1,n\cdot(n-2)}$;
$d_{n,n-1} = -B_{n,n-1\cdot(n-2)}$; and $d_{nn} = B_{n-1,n-1\cdot(n-2)}$, and these are available from the
calculation of the B's (though $B_{nn\cdot(n-2)}$ is not recorded). (See the lower right
four entries of the B's and d's in Table II.) With these four values
immediately available, the use of but $n^2 - 4$ additional equations is demanded,
or this additional information can be used in checking.


TABLE II

Form for computation of adjugate (with check) and then inverse

Matrix a

        26       -10        15        32
        19        45       -14        -8
       -12        16        27        13
        32        29       -35        28

Matrix of the B's (upper entry of each cell) with the elements of the transposed adjugate (lower entry)

        26       -10        15        32
     66233    -16033     42069     -6503

        19      1360      -649      -816
     56151     28558     33194    -52258

       -12       296     53524     47056
    -53068     36236     18224     45899

        32      1074    -45899   2305327
    -35013      9659    -47056     53524

Check: product of a and the adjugate, equal to | a | I

   2305327         0         0         0
         0   2305327         0         0
         0         0   2305327         0
         0         0         0   2305327

Five decimal place approximation to the transposed inverse

    .02873   -.00695    .01825   -.00282
    .02436    .01239    .01440   -.02267
   -.02302    .01572    .00791    .01991
   -.01519    .00419   -.02041    .02322
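The computation summarized in Table II can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' desk-machine layout: the function names `condense` and `row_by_row` are hypothetical, the matrix is the one of Table II with second diagonal entry taken as 45 (the value consistent with the tabulated B's and determinant), and the transposed adjugate is entered as data from the table rather than solved from (34) and (35).

```python
def condense(a):
    """Matrix of the B's: multiplication and subtraction with exact division."""
    n = len(a)
    b = [row[:] for row in a]
    prev_pivot = 1  # divisor used at the first stage
    for j in range(n - 1):
        pivot = b[j][j]  # assumed nonzero in this sketch
        for r in range(j + 1, n):
            for k in range(j + 1, n):
                num = pivot * b[r][k] - b[r][j] * b[j][k]
                quotient, remainder = divmod(num, prev_pivot)
                assert remainder == 0, "the division must be exact"
                b[r][k] = quotient
        prev_pivot = pivot
    return b


def row_by_row(a, d_t):
    """Multiply rows of a by rows of the transposed adjugate (the check)."""
    return [[sum(x * y for x, y in zip(ra, rd)) for rd in d_t] for ra in a]


A = [[26, -10, 15, 32],
     [19, 45, -14, -8],
     [-12, 16, 27, 13],
     [32, 29, -35, 28]]

# Transposed adjugate as read from Table II.
D_T = [[66233, -16033, 42069, -6503],
       [56151, 28558, 33194, -52258],
       [-53068, 36236, 18224, 45899],
       [-35013, 9659, -47056, 53524]]

B = condense(A)
det = B[-1][-1]            # the last B is the determinant
check = row_by_row(A, D_T)  # should equal det times the identity
inverse_t = [[round(d / det, 5) for d in row] for row in D_T]
```

Because every division in `condense` is exact, integer arithmetic suffices throughout, and the lower right entries of `B` reproduce three of the four corner cofactors directly.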


REFERENCES

[1] F. V. Waugh, "A simplified method of determining multiple regression constants,"
Jour. Am. Stat. Assn., Vol. 30 (1935), pp. 694-700.

[2] P. S. Dwyer, "The solution of simultaneous equations," Psychometrika, Vol. 6 (1941),
pp. 101-129.

[3] P. S. Dwyer, "The evaluation of determinants," Psychometrika, Vol. 6 (1941),
pp. 191-204.

[4] P. S. Dwyer, "The Doolittle technique," Annals of Math. Stat., Vol. 12 (1941),
pp. 449-458.

[5] P. S. Dwyer, "The evaluation of linear forms," Psychometrika, Vol. 6 (1941),
pp. 355-365.

[6] P. S. Dwyer, "Recent developments in correlation technique," Jour. Am. Stat. Assn.,
Vol. 37 (1942), pp. 441-460.

[7] P. S. Dwyer, "A matrix presentation of least squares and correlation theory with
matrix justification of improved methods of solution," Annals of Math. Stat.,
Vol. 15 (1944), pp. 82-89.

[8] P. D. Crout, "A short method for evaluating determinants and solving systems of
linear equations with real or complex coefficients," Am. Institute of Electrical
Engineers, 33 West 39 Street, New York City, or Marchant Methods MM-182,
Sept., 1941, Marchant Calculating Machine Co., Oakland, Calif.

[9] Harold Hotelling, "Some new methods in matrix calculation," Annals of Math.
Stat., Vol. 14 (1943), pp. 1-34.

[10] F. E. Satterthwaite, "Error control in matrix calculation," Annals of Math. Stat.,
Vol. 15 (1944), pp. 373-387.

[11] R. A. Fisher, Statistical Methods for Research Workers, 5th edition, London, 1934,
p. 160.

[12] C. F. Gauss, "Supplementum theoriae combinationis observationum erroribus minimis
obnoxiae," Göttingen, 1873, Werke, Band IV.

[13] M. H. Doolittle, "Method employed in the solution of normal equations and the
adjustment of triangulation," U. S. Coast and Geodetic Survey Report,
1878, pp. 115-120.

[14] A. C. Aitken, "Studies in practical mathematics I. The evaluation, with applications,
of a certain triple product matrix," Proc. Royal Soc. Edinburgh, Vol. 57
(1937), pp. 172-181.

[15] C. Hermite, "Sur une question relative à la théorie des nombres," Journal de
Mathématiques pures et appliquées, (1849). Also Oeuvres, Tome I, pp. 265-273.

[16] F. Chiò, "Mémoire sur les fonctions connues sous le nom de résultantes ou de
déterminants," Turin, 1853.

[17] C. L. Dodgson, "Condensation of determinants," Proc. Royal Soc., Vol. 15 (1866),
pp. 150-155.

[18] A. C. Aitken, Determinants and Matrices, 2nd edition, Oliver and Boyd, Edinburgh,
1942.

[19] M. Bôcher, Introduction to Higher Algebra, (1907), Macmillan Co., New York.



MULTIPLE MATCHING AND RUNS BY THE SYMBOLIC METHOD 

Irving Kaplansky and John Riordan
New York City 

1. Introduction. The two subjects in the title have generally been treated by
distinct methods, an excellent summary of which is given by S. S. Wilks in
Chapter X of [13]. For two-deck matching, an appreciable simplification over
the classical work of MacMahon [7], which seems to underlie the generating
function used by Wilks [12] and Battin [2], has been shown by one of us [5]
to follow from symbolic methods. Here we give an elaboration of these methods
to multiple matching and to runs.

The basis of the symbolic method in both problems has been given in [6],
but for completeness a skeleton résumé is given in Section 2 below. A new
point is stressed: the relation of coefficients in polynomials of the symbolic
method to factorial moments (cf. Fréchet [4]).

The emphasis for the most part is on showing the expedition of the symbolic
method in reaching known results, but in several instances new results are
obtained.

2. Symbolic expressions and moments. Let $A_1, \cdots, A_n$ be arbitrary events
and let $p(A_{i_1}, \cdots, A_{i_k})$ denote the joint probability of $A_{i_1}, \cdots, A_{i_k}$; let
$P_r$ be the probability that exactly $r$ of the events occur. Then

(1)  $P_r = \sum_k (-1)^{k-r} \binom{k}{r} \sum p(A_{i_1}, \cdots, A_{i_k})$

and in particular

$P_0 = \sum_k \sum (-1)^k p(A_{i_1}, \cdots, A_{i_k}),$

or symbolically

(2)  $P_0 = [1 - p(A_1)][1 - p(A_2)] \cdots [1 - p(A_n)].$

The cases to be studied will be exclusively ones where so-called quasi-symmetry
holds, i.e., $p(A_{i_1}, \cdots, A_{i_k})$ is either 0 or a function $\phi_k$ of $k$ alone. In that
event (2) can be evaluated as follows: suppress all products that vanish, and
form a polynomial $f(E)$ by replacing each surviving term $p(A_i)$ by $E$. Then
$P_0 = f(E)\phi_0$ where $E$ is a displacement operator: $E^k \phi_0 = \phi_k$.

The same polynomial $f(E)$ can also be used to obtain $P_r$ and the moments of
the distribution. From (1) we see that $P_r = f(E)\psi_0$, where $\psi_k = (-1)^r \binom{k}{r} \phi_k$.
Again it is well known (Fréchet [4]) that the $k$-th factorial moment, defined by

$M_{(k)} = \sum_{r=0}^{n} r(r-1) \cdots (r-k+1) P_r,$

is also given by

$M_{(k)} = k! \sum p(A_{i_1}, \cdots, A_{i_k}).$

It follows that the terms of $f(E)\phi_0$ are essentially the factorial moments. More
precisely, if

$f(E) = \sum_k S_k (-E)^k,$

then

(3)  $M_{(k)} = k!\, S_k\, \phi_k.$

3. Card matching. To avoid complications which add nothing to the fundamental
idea, the case of three decks will be considered explicitly. As remarked
by Battin [2], there is no loss of generality in supposing that the three decks
have the same number of cards: let them be numbered from 1 to $n$. Let $p_{ijk}$
denote the probability that the $i$-th, $j$-th, and $k$-th cards of the three decks are
matched, that is, all occur in say the $l$-th place. The condition of quasi-symmetry
is fulfilled, the (symbolic) product of $h$ of the $p$'s being either 0 or $\phi_h = [(n - h)!/n!]^2$.

The simplest problem is to find the probability that there be no triple matches
of the form $(i, i, i)$. Since no products of the expression

$(1 - p_{111})(1 - p_{222}) \cdots (1 - p_{nnn})$

vanish, the answer is $(1 - E)^n \phi_0$, in agreement with Anderson [1] (cf. also
problem E 589 in the American Mathematical Monthly, p. 512, 1943; solution
by John Riordan, p. 287, 1944).

Suppose now that the decks are given compositions in the usual fashion by
having $a_1, b_1, c_1$ aces respectively, $a_2, b_2, c_2$ deuces, etc. We may number the
cards so that $1, \cdots, a_1$ are aces, $a_1 + 1, \cdots, a_1 + a_2$ are deuces, and similarly
in the other decks. The probability of precisely $r$ matches among cards of the
same denomination is then given by

(4)  $F(a_1, b_1, c_1)\, F(a_2, b_2, c_2) \cdots \psi_0,$

where

$F(a, b, c) = \prod (1 - p_{ijk}),$

the symbolic product being taken over ranges $i = 1, \cdots, a$; $j = 1, \cdots, b$;
$k = 1, \cdots, c$.

A simple combinatorial argument reveals that

(5)  $F(a, b, c) = \sum_t (a)_t (b)_t (c)_t (-E)^t / t!$

where $(a)_t = a(a - 1) \cdots (a - t + 1)$ is the Jordan factorial notation. The
problem of matching arbitrary decks is thus compactly solved by (4) and (5).





4. Examples. When decks of explicit structure are in question, the computation
of probabilities and moments reduces to straightforward algebra, as is
illustrated in the three following examples.

1. Suppose each of three decks has two suits of two cards each. Then, since

$F(2, 2, 2)^2 = (1 - 8E + 4E^2)^2 = 1 - 16E + 72E^2 - 64E^3 + 16E^4,$

it follows that

$(4!)^2 P_0 = (4!)^2 - 16(3!)^2 + 72(2!)^2 - 64(1!)^2 + 16(0!)^2$
$\qquad\quad\; = 576 - 576 + 288 - 64 + 16 = 240,$

and the calculation of $(4!)^2 P_r$ may be set forth as follows:

   r
   0    576 - 576 + 288 - 64 + 16 = 240
   1          576 - 576 + 192 - 64 = 128
   2                288 - 192 + 96 = 192
   3                       64 - 64 =   0
   4                            16 =  16

each column being obtained by multiplying its first row entry by a binomial
coefficient. These results may be verified readily by direct enumeration.
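The direct enumeration mentioned above can be sketched in Python alongside the symbolic calculation. This is a minimal sketch under stated assumptions: the helper names (`falling`, `f_poly`, `poly_mul`, `symbolic_counts`, `brute_force_counts`) are hypothetical, the first deck is held fixed while the other two are permuted, and a match at a position means that all three cards there have the same suit.

```python
from itertools import permutations
from math import comb, factorial


def falling(x, t):
    """Jordan factorial (x)_t = x(x-1)...(x-t+1)."""
    out = 1
    for i in range(t):
        out *= x - i
    return out


def f_poly(a, b, c):
    """Coefficients of F(a,b,c) in powers of E, following (5)."""
    top = min(a, b, c)
    return [(-1) ** t * comb(a, t) * falling(b, t) * falling(c, t)
            for t in range(top + 1)]


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def symbolic_counts(poly, n):
    """(n!)^2 P_r for three decks, using psi_k = (-1)^r C(k,r) phi_k."""
    return [sum(c * (-1) ** r * comb(k, r) * factorial(n - k) ** 2
                for k, c in enumerate(poly))
            for r in range(n + 1)]


def brute_force_counts(suits):
    """Count arrangements of decks 2 and 3 by number of triple matches."""
    n = len(suits)
    counts = [0] * (n + 1)
    for p2 in permutations(range(n)):
        for p3 in permutations(range(n)):
            r = sum(suits[i] == suits[p2[i]] == suits[p3[i]] for i in range(n))
            counts[r] += 1
    return counts


poly = poly_mul(f_poly(2, 2, 2), f_poly(2, 2, 2))
by_symbols = symbolic_counts(poly, 4)
by_enumeration = brute_force_counts([0, 0, 1, 1])
```

The two lists agree row for row with the table of Example 1, and both sum to $(4!)^2 = 576$.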

2. In the case of three 5 by 5 decks, the polynomial is

$F(5, 5, 5)^5 = (1 - 125E + 4000E^2 - 36000E^3 + 72000E^4 - 14400E^5)^5$
$\qquad\qquad\; = 1 - 625E + 176{,}250E^2 - 29{,}711{,}250E^3 + 3{,}346{,}063{,}125E^4 - \cdots$

The factorial moments can be obtained using (3):

$M_{(1)} = 625/25^2 = 1,$

$M_{(2)} = 2 \cdot 176250/(25^2 \cdot 24^2) = 47/48,$

$M_{(3)} = 7923/8464,$

$M_{(4)} = 1784567/2048288,$

the first two in agreement with Battin [2].
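The moments of this example can be checked with exact rational arithmetic. The sketch below assumes three decks of $n = 25$ cards, so that $\phi_k = 1/[(n)_k]^2$, and applies (3) in the form $M_{(k)} = k!\,S_k\,\phi_k$; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def falling(x, t):
    """Jordan factorial (x)_t = x(x-1)...(x-t+1)."""
    out = 1
    for i in range(t):
        out *= x - i
    return out


def f_poly(a, b, c):
    """Coefficients of F(a,b,c) in powers of E, following (5)."""
    return [(-1) ** t * comb(a, t) * falling(b, t) * falling(c, t)
            for t in range(min(a, b, c) + 1)]


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def poly_pow(p, e):
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out


def factorial_moment(poly, n, k, decks=3):
    """M_(k) = k! S_k phi_k with S_k = |coefficient of E^k|."""
    s_k = abs(poly[k])
    phi_k = Fraction(1, falling(n, k) ** (decks - 1))
    return factorial(k) * s_k * phi_k


base = f_poly(5, 5, 5)
poly = poly_pow(base, 5)
moments = [factorial_moment(poly, 25, k) for k in range(1, 5)]
```

Only the first five coefficients of the fifth power are needed for $M_{(1)}$ through $M_{(4)}$, but computing the full product keeps the sketch short.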

3. The symbolic method can be applied to more intricate kinds of matching,
as this final example shows. Suppose that the six matches represented by
(123) and its permutations are forbidden, likewise the six matches represented
by permutations of (456), and so on in groups of three. Then

$(1 - p_{123})(1 - p_{132})(1 - p_{213})(1 - p_{231})(1 - p_{312})(1 - p_{321}) = 1 - 6E + 6E^2 - 2E^3,$

and so the answer is

$(1 - 6E + 6E^2 - 2E^3)^{n/3}\, \phi_0.$

The analogous problem for 4 decks has the solution

$(1 - 24E + 108E^2 - 96E^3 + 24E^4)^{n/4}\, \phi_0.$

The generalization to an arbitrary number of decks involves the enumeration
of Latin rectangles, in itself a formidable problem.
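The connection with Latin rectangles can be made concrete by counting directly. In the sketch below (the function name `latin_coefficients` is hypothetical) each forbidden match is a permutation of $m$ symbols, two matches can occur together only if they differ in every coordinate, and the coefficient of $E^k$ is, up to sign, the number of sets of $k$ mutually compatible permutations, i.e. of $k \times m$ Latin rectangles with unordered rows.

```python
from itertools import combinations, permutations


def latin_coefficients(m):
    """Coefficients of the polynomial in E for one group of m symbols."""
    perms = list(permutations(range(m)))

    def compatible(p, q):
        # Two matches can coexist only if they differ in every coordinate.
        return all(x != y for x, y in zip(p, q))

    coeffs = [1]
    for k in range(1, m + 1):
        count = sum(
            1
            for rows in combinations(perms, k)
            if all(compatible(p, q) for p, q in combinations(rows, 2))
        )
        coeffs.append((-1) ** k * count)
    return coeffs


three = latin_coefficients(3)
four = latin_coefficients(4)
```

The exhaustive search is feasible only for very small $m$, which is consistent with the remark that the general enumeration is formidable.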

5. Moment formulas. It is possible to deduce from (4) and (5) fairly explicit
formulas for the factorial moments. Let us define $u^t = (a)_t (b)_t (c)_t$. Then
(5) may be written symbolically as

$F(a, b, c) = \sum_t u^t (-E)^t / t! = \exp(-uE).$

Writing $F(a_i, b_i, c_i) = \exp(-u_i E)$, we then have

$P_0 = \exp[-(u_1 + u_2 + \cdots)E]\, \phi_0 = \sum_t (u_1 + u_2 + \cdots)^t \dfrac{(-E)^t}{t!}\, \phi_0,$

or finally, if $m + 1$ decks are being matched,

(6)  $P_0 = \sum_t (-1)^t (u_1 + u_2 + \cdots)^t / [\,t!\, (n)_t^m\,].$

It is to be borne in mind that after expansion of $(u_1 + u_2 + \cdots)^t$ by the multinomial
theorem, each term $u_1^i u_2^j u_3^k \cdots$ is interpreted symbolically, with the powers of the
u's defined as above.

By (3), factorial moments corresponding to (6) are given by

(7)  $M_{(t)} = (u_1 + u_2 + \cdots)^t / (n)_t^m.$

Thus in particular

$n^m M_{(1)} = u_1 + u_2 + \cdots = \sum_i a_i b_i \cdots$

$n^m (n - 1)^m M_{(2)} = (u_1 + u_2 + \cdots)^2 = \sum_i a_i(a_i - 1) b_i(b_i - 1) \cdots + 2 \sum_{i<j} a_i a_j b_i b_j \cdots,$

the cases $m = 1, 2$ in agreement with Battin [2].

In the simple case where $m = 1$ (two decks), $a_i = b_i = a$ and $n = sa$, we have
$u^t = (a)_t^2$ and

(8)  $(n)_t M_{(t)} = (u + u + \cdots + u)^t$

with $s$ u's in the parenthesis. The right of (8) is the multi-variable polynomial
of E. T. Bell [3], Y tiyi , 3 / 2 , ■ • ■ , y<) with y* = and (s) a symbolic factorial 
such that etc. Instances of ( 8 ) may bo compared with 

Olds [9]. 





Expanding (8) we obtain

$(n)_t M_{(t)} = (s)_t [u]^t + \binom{t}{2} (s)_{t-1} [u]^{t-2} u^2 + \cdots$
$\qquad\qquad = (s)_t\, a^{2t} + \binom{t}{2} (s)_{t-1}\, a^{2t-2} (a - 1)^2 + \cdots$

and, since $(s)_t/(n)_t \to a^{-t}$ as $n \to \infty$, it follows that $M_{(t)} \to a^t$, i.e., the limiting
distribution is Poisson with mean $a$. As indicated in [6] one may proceed to
obtain successive terms of an asymptotic series for the distribution. These
results generalize to the case where $M_{(1)} = \sum a_i b_i / n$ approaches a finite limit as
$n \to \infty$. In certain instances where $M_{(1)} \to \infty$, asymptotic normality can be
proved (cf. [1] and [8]).

6. Successions and runs. As shown in [6], enumeration of permutations with
a specified number of 2-successions like 12, 42, $\cdots$ may be accomplished by
introduction of symbols like $q_{12}$, $q_{42}$, denoting probabilities that 1 immediately
precede 2, 4 precede 2, resp. For permutations of $n$ objects of which $a_1$ are of one
kind, $a_2$ of a second, $\cdots$ with $a_1 + a_2 + \cdots + a_s = n$, the probability of exactly
$r$ 2-successions is ([6] p. 914)

(9)  $P_r = G(a_1) G(a_2) \cdots G(a_s)\, \psi_0$

with $\psi_k = (-1)^r \binom{k}{r} (n - k)!/n!$ and

$G(a) = \sum_{k \ge 0} (a)_k (a - 1)_k (-E)^k / k!.$

It is to be noted that in deriving (9), elements of the first kind are numbered
1 to $a_1$, of the second $a_1 + 1$ to $a_1 + a_2$, $\cdots$ and a succession occurs if either
$i$ precedes $j$ or $j$ precedes $i$ with $i$ and $j$ in the same set.

For $s = 2$, i.e., two kinds of elements, there is a simpler formula due to Stevens
[10], but for the general case (9) seems to be the only reasonably explicit solution
known. In particular, for the function $F(a_1, \cdots, a_s)$ of Mood [8] which
enumerates the number of permutations with no 2-successions, we have

$F(a_1, \cdots, a_s) = n!\, G(a_1) \cdots G(a_s)\, \phi_0.$

Factorial moments for 2-successions are given at once by (7):

(10)  $M_{(t)} = (u_1 + u_2 + \cdots + u_s)^t / (n)_t$

with $u_i^t = (a_i)_t (a_i - 1)_t$.

It is more usual to classify permutations according to the number of runs,
say $r'$, a run consisting of a succession of $i$ like elements ($i = 1, 2, \cdots$). Since
every 2-succession causes the loss of a potential run, we have $r' = n - r$, i.e. the
number of runs is $n$ diminished by the number of 2-successions. Factorial
moments $M'_{(t)}$ for runs are then given by the usual formula for change of origin:

(11)  $M'_{(t)} = \sum_{i=0}^{t} (-1)^i \binom{t}{i} (n - i)_{t-i}\, M_{(i)}.$

Examples. 1. Introducing $\alpha_i$ for the $i$-th elementary symmetric function
of the $a$'s,

$\alpha_1 = a_1 + a_2 + \cdots + a_s = n,$

$\alpha_2 = a_1 a_2 + a_1 a_3 + \cdots + a_{s-1} a_s,$

$\alpha_3 = a_1 a_2 a_3 + \cdots,$

we may derive from (10) and (11) the formula

(12)  $M'_{(1)} = 1 + 2\alpha_2 / n$

for the mean number of runs. The variance, the same for runs and 2-successions,
is given by

(13)  $\sigma^2 = M_{(2)} + M_{(1)} - M_{(1)}^2 = \dfrac{2\alpha_2 (2\alpha_2 - n) - 6 n \alpha_3}{n^2 (n - 1)}.$

For runs of two kinds of elements, formulas (12) and (13) specialize to those
given by Wald and Wolfowitz [11].
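Formulas (12) and (13) can be compared with a direct enumeration for small compositions. The sketch below is illustrative only: the function names are hypothetical, and it assumes that all distinct arrangements of the objects are equally likely, which is the same as taking all $n!$ permutations of numbered objects as equally likely.

```python
from fractions import Fraction
from itertools import combinations, permutations


def runs_by_enumeration(counts):
    """Exact mean and variance of the number of runs, by listing arrangements."""
    items = [kind for kind, c in enumerate(counts) for _ in range(c)]
    arrangements = set(permutations(items))  # distinct arrangements only
    runs = [1 + sum(x != y for x, y in zip(arr, arr[1:])) for arr in arrangements]
    total = len(runs)
    mean = Fraction(sum(runs), total)
    var = Fraction(sum(r * r for r in runs), total) - mean ** 2
    return mean, var


def runs_by_formula(counts):
    """Mean from (12) and variance from (13)."""
    n = sum(counts)
    alpha2 = sum(x * y for x, y in combinations(counts, 2))
    alpha3 = sum(x * y * z for x, y, z in combinations(counts, 3))
    mean = 1 + Fraction(2 * alpha2, n)
    var = Fraction(2 * alpha2 * (2 * alpha2 - n) - 6 * n * alpha3, n * n * (n - 1))
    return mean, var


small = runs_by_formula([2, 1, 1])
larger = runs_by_formula([3, 2, 2])
```

With two kinds only, $\alpha_3 = 0$ and the variance reduces to $2\alpha_2(2\alpha_2 - n)/[n^2(n-1)]$.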

2. For runs of elements of a single kind, factors in (9) pertaining to other
elements are suppressed. Thus if $a$ is written for $a_1$, and terms in $a_2, \cdots, a_s$ are
suppressed, (9) and (10) become

$P_r = G(a)\, \psi_0,$

$M_{(t)} = (a)_t (a - 1)_t / (n)_t.$

Moments for runs are given by

$M'_{(t)} = \sum_{i=0}^{t} (-1)^i \binom{t}{i} (a - i)_{t-i}\, M_{(i)} = (a)_t (n - a + 1)_t / (n)_t$

in agreement with Mood [8].


REFERENCES

[1] T. W. Anderson, "On card matching," Annals of Math. Stat., Vol. 14 (1943), pp. 426-436.

[2] I. L. Battin, "On the problem of multiple matching," Annals of Math. Stat., Vol. 13
(1942), pp. 294-305.

[3] E. T. Bell, "Exponential polynomials," Annals of Math., Vol. 35 (1934), pp. 258-277.

[4] M. Fréchet, "Les probabilités associées à un système d'événements compatibles et
dépendants," Actualités Scientifiques et Industrielles, no. 859, Paris, 1940.

[5] I. Kaplansky, "On a generalization of the problème des rencontres," Amer. Math.
Monthly, Vol. 46 (1939), pp. 159-161.

[6] I. Kaplansky, "Symbolic solution of certain problems in permutations," Bull. Amer.
Math. Soc., Vol. 50 (1944), pp. 906-914.

[7] P. A. MacMahon, Combinatory Analysis, Cambridge, 1915, especially Vol. I, pp. 99-114.

[8] A. M. Mood, "The distribution theory of runs," Annals of Math. Stat., Vol. 11 (1940),
pp. 367-392.

[9] E. G. Olds, "A moment-generating function which is useful in solving certain matching
problems," Bull. Amer. Math. Soc., Vol. 44 (1938), pp. 407-413.

[10] W. L. Stevens, "Distribution of groups in a sequence of alternatives," Annals of
Eugenics, Vol. 9 (1939).

[11] A. Wald and J. Wolfowitz, "On a test whether two samples are from the same
population," Annals of Math. Stat., Vol. 11 (1940), pp. 147-162.

[12] S. S. Wilks, Statistical Aspects of Experiments in Telepathy, a lecture delivered to the
Galois Institute of Mathematics, Long Island University, Dec. 4, 1937.

[13] S. S. Wilks, Mathematical Statistics, Princeton, 1943.



ON THE POWER FUNCTIONS OF THE $E^2$-TEST AND THE $T^2$-TEST

By P. L. Hsu
National University of Peking 

1, The general linear hypothesis. Every linear hypothesis about a p-variate 
normal population or several such populations having common variances and 
covariances is reducible to the following canonical form [4]: The sample distri¬ 
bution, when nothing whatever has been discarded from the whole sample, being 

I r-1 

V n y 

('ll iVif - V<r)(i/yr - Vyr) ' J S «<; S U dj/ (fo 

^ ' i.i-i .-i J 

(n > p), 

where the »;,> and the a,, are unknown, the hypothesis to be tested is 

Hi rnr = Q (f = 1, • • ■ , p; r = 1, • •' , ni, rii < m). 

It is clear that the (i = 1 , p; r = ni-fl, • • • , w) can have no use. 
Also, the only useful quantities supplied by the set g,-, are the statistics 

b(f ** 2(«2y« , 

because the remaining quantities may be regarded as a set of angles which are 
independent of y,v and the Ij.y and which has a known distribution free from any 
unknown parameter in (1), [2]. After discarding the irrelevant j/’s and the angles 
there results the reduced sample distribution 

K 1 a,y 16.y exp (-i i: a,j 

[, M-1 

nt P ^ 

12 {y^r — yi^rKVir - vjr) “ I 2 B dy dh. 

r-1 ir?"”! J 

Hereafter the indices $i$, $j$ and $r$ shall have the following ranges:
$i, j = 1, \cdots, p$; $r = 1, \cdots, n_1$,

and the convention that repetition of an index indicates summation will be
adopted. Writing

aij ^ VirV^ I * Oiij -|- l){j , 

we obtain the distribution of the y,T and the c,y: 

Klayyl*'"'-^"’ lc.y - a.-, j 

exp d- ^otIjTjij'T] jf)YL dy dc, 



(2) 





In the remaining two sections of this paper we deal exclusively with the
special cases $p = 1$ and $n_1 = 1$. According as $p = 1$ or $n_1 = 1$ we shall drop
the indices $i$ and $j$ or the index $r$.

The case p = 1. When p = 1, (2) reduces to 

(c — PrPr)*" exp (—^aC + aprijr — ^oiijrnr) dcH dy. 

Putting pr = c^Xr we obtain 

(3) _ a:,ar)‘”“^exp (~ ^ac + ac^Xrtjr — |a7iri7r) rfcllda;. 

The hypothesis $H$ is now

$H'$: $\eta_r = 0$ $(r = 1, \cdots, n_1)$.

If $w$ is any critical region for the rejection of $H'$, denote by $w(c)$ the cross
section of $w$ for every fixed $c$. Then the power function of $w$ is

fiwirii a) = fiwivi, ■ ■ ■ , T/m, a) 

do f (1 - dx. 

Jo •'iD(c) 

It is known [3] that, in order to have 

(5) «) = « 
for all a, it is necessary and sufficient that 

(6) f (1 — dx = Ae, 

Jw(c) 

where A is a constant. 

The $E^2$-test is the test based on the critical region

$w_0$: $x_r x_r = c^{-1} y_r y_r = E^2 >$ const.

The author has proved [3] that of all the critical regions which satisfy (5) and
whose power function is a function of $\alpha \eta_r \eta_r$ alone, the region $w_0$ is the uniformly
most powerful one. This result is generalized by Wald [7], who proved that, of
all the regions satisfying (5), the surface integral

$\gamma_w(\alpha, \lambda) = \int_{\eta_r \eta_r = \lambda} \beta_w(\eta, \alpha)\, dA$

is maximum when $w$ is $w_0$. The author gives here another proof of Wald's
theorem which is easier as it dispenses with the somewhat intricate Lemma 1
of Wald. From (4) we have

7„(a, X) f 

Jo 


/ (1 — XrXr)*" ^Udx exp (—|a7),7)r + X,r]r) dA. 

JwM Jllrll,—X 





By means of a rotation in the space of (iji, ■ • • , we can obtain 
/ exp + aC^XrTJr) dA 

= [ exp (-Mrfr + ac*(a:rXr)*fl) dA = 2^aka'‘(cXrXr)\ 
Jfrrr-^ *-0 

where oi depends only on a, k and X. Hence 

(7) 7»(«, X) = i: 6* r dc f (xrXr)^Xl - dx, 

Jt-aQ Jfl *'I 0 (C) 


where $b_k$ depends only on $k$, $\alpha$ and $\lambda$. Since $w(c)$ satisfies (6), it follows from a
lemma of Neyman and Pearson [5] that


f (a:rXr)*(l — aJrXr)*"'* 
•^w(c) 


n dx 


is maximum, for all $c$ and $k$, when $w(c)$ is the region $x_r x_r >$ const., i.e. when $w$
is itself the region $x_r x_r >$ const. This proves Wald's theorem.

Still another optimum property of the $E^2$-test may be established on using
the volume integral instead of the surface integral. This is stated in the following
theorem.

Theorem 1. Let $S$ be any linear set and let

$\varphi_w(\alpha, S) = \int_S \beta_w(\eta, \alpha)\, \Pi\, d\eta.$

Of all the regions satisfying (6), the region $w_0$ has the maximum $\varphi_w(\alpha, S)$.

For, by the same computation which leads to (7), we easily obtain 

^„(a, >S) = E C* r dc [ (XrXr)’’(l - XrXr)*”~'n dX, 

fefaO Jo JiO{o} 


where $c_k$ depends only on $k$, $\alpha$ and $S$. Hence the result follows.

This theorem also contains my previous result as a consequence. For, writing

Mv, a) = fiaJirVr), Pwtiv, a) = faiarirVr), 

we have 

^ ^ (/o(«’?r1)r) - /(a»;r»)r))n dt) = ^ ^ " /(«<)) dt. 


Since $S$ is arbitrary, we must have $f(\alpha t) \le f_0(\alpha t)$.

The case $n_1 = 1$. When $n_1 = 1$, (2) and $H$ become respectively

V I ~ |t("+n I - ,, ,, iHn-l’-l) 

A I o„ 1 I Cij — 2/,2/y I 


H‘ 


II. 


exp (—ia.Aj + oiviUiVs — dy dc, 

= 0 (f = 1, •• • , p). 


( 8 ) 





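There is a unique real matrix

$T = \begin{bmatrix} t_{11} & 0 & \cdots & 0 \\ t_{21} & t_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ t_{p1} & t_{p2} & \cdots & t_{pp} \end{bmatrix}$  ($t_{ii} > 0$; zeros above the principal diagonal)

such that $[c_{ij}] = TT'$ [2]. Introducing the new variables $x_1, \cdots, x_p$ by means
of the transformation

(9)  $[y_1, \cdots, y_p] = [x_1, \cdots, x_p]\, T'$

with the Jacobian $|T| = |c_{ij}|^{1/2}$ we obtain the distribution

(10)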
fix, c)Jldxdc = 

■exp i^^CiijC,J “t“ “ ^Ci,f1)i1}j)TLdX(ic 

(fc = 1, • • • , p; ijt, = 0 when k > t). 

If $w$ is any region, we write

$\beta_w(\eta, \alpha) = \beta_w(\eta_1, \cdots, \eta_p, \alpha_{11}, \alpha_{12}, \cdots, \alpha_{pp}) = \int_w f(x, c)\, \Pi\, dx\, dc,$

so that $\beta_w(\eta, \alpha)$ is the power function if $w$ serves as a critical region for rejecting
$H''$. We have, symbolically,

$w = D \times w(c),$

where $D$ is the set of points $(c_{ij})$ for which $[c_{ij}]$ is positive definite and $w(c)$ is
the cross section of $w$ for fixed $c_{ij}$. Then

Jd 

■ f (1 - x<Xv)‘'""'~”e“"‘“*''’'n dx. 

•'U7(0) 

It is known [6] that, in order to have 
(11) /3«(0, a) = e 

for all a,,, it is necessary and sufficient that 

f (1 - x.x.)*‘"~'’~”ndx = Be, 

*»p(e) 

where B = f (1 — x.x,)‘^"~'’"^’n dx. 
■'*<*(£1 

The $T^2$-test is the test based on the critical region

$w_0$: $x_i x_i = c^{ij} y_i y_j = T^2/(1 + T^2) >$ const., or $T^2 >$ const.,


( 12 ) 






where $c^{ij}$ is the general element of $[c_{ij}]^{-1}$ and $T$ is, except for a constant factor,
Hotelling's generalization of "Student's" ratio.

In order to establish an optimum property of $T^2$ analogous to that of $E^2$ given
in Theorem 1, we define, for any linear set $S$ and any region $R$ in the sample
space,


\paiS) = f jSsC?;, a)n drj da. 


$\psi_{wR}(S)$ does not necessarily have a finite value, and it is this fact which renders
the following theorem less satisfactory than Theorem 1.

Theorem 2. Let $\mu_p$ be the smallest latent root of $[c_{ij}]$ and let $R$ be any subset
of $D$ in which $\mu_p$ is at least equal to a fixed positive constant. Of all the critical
regions $w$ which satisfy (11), the region $w_0$ has the maximum $\psi_{wR}(S)$.

In order to prove this theorem we need the following two lemmas.

Lemma 1 If c is a positive constant, the, integral 


1 = 1 lc.vr'’’^'n, 


has a finite value. 

Proof. Let pi , ■ • • , p, be the latent roots of [c,,] in the descending order 
of magnitude. From a known theorem [1] wo get 


= c/ 

'ic 




Hence I is finite. 
Lemma 2. 


-P;)ndp 


<c f ■ • • f (i dp. ■ • ■ dp. 


(13) ypus) = E f lew ndc f (1 - Xt)'‘n dx 

and ypusiS) is finite, where gt depends only on k and S. 

Proof. Let A be the set of points iaij) for which [cu.j] is positive definite 
By (8), we have 

MS) = K [ lew - U dy dc f \ «wdH da, 

kfltfjf •'A 


where 


J = / exp (—iaijmvj + dy. 


There is a real non-singular matrix G = j^w] such that [a.'j] = GG'. Using the 
transformation 


[nit • ■ ■ , np]G = , 





whose Jacobian is | G 1 ^ = | «,■/1 \ we have 

J = \ r* f exp (-Htr.- + £';<r, 2 /j)n dir. 

•'fittta 

This is reducible by means of a rotation to 

J = I 1“^ / exp (-ir. T. + (aijy, 2 /y)Vi)II dr 

tS 


■“ 1 1 ^ S } 

k-0 


where 


^ 1 f a^ 1 r" r“ ^ ^ (ztt)*” 


and djt depends only on A: and <Si. Hence 


where 


f I J n da = S 4 7. , 

Ja fc“*0 

1(a., 

JA 




where 


fit) = f 11‘" n da = 1 C.y - 2ty. y, 

Ja 


—Kh+P+1) 




Hence 


where 


7^ — Cfc { Cij 


l—Un+p+D/u \h 




g.a»r(y.± | -'t ‘ +1) 


'» + P + 





Hence 


= K'td.e, f \c,j I Ci, - y,y, {c*’ y. y,Y n dy dc 

= E (?* f I c.; dc f (1 - rc(a:.-)‘‘""^"*’(a;ia;,)*n dx, 

k—a •'B Jia(e) 

where gk = Kidifik depends only on k and S 
Now 


f (1 — Xf (x, Xi)'‘U dx < f H dx, 

•' ui ( b ) JxiX{<l 

f |c,yr''+»>ndc < [ 


**XiX 

|-Kp+i) 


n dc 


**pp'^c>0 


is finite by l^mma 1. Hence 


4'uii(S) < const. J^dkek = const. ^ --^ 


and so $\psi_{wR}(S)$ is finite. This proves Lemma 2.

Proof of Theorem 2. Since $\psi_{wR}(S)$ is expressible as (13) and is always finite,
it follows from (12) and the Neyman-Pearson Lemma that $\psi_{wR}(S)$ is maximum
when $w$ is $w_0$. This proves Theorem 2.

Simaika [6] proved that of all the critical regions $w$ which satisfy the conditions

(a) $\beta_w(0, \alpha) = \epsilon$ for all $\alpha_{ij}$,

(b) $\beta_w(\eta, \alpha) = f(\alpha_{ij} \eta_i \eta_j)$,

$w_0$ is the uniformly most powerful one. Strangely enough, this result cannot
be deduced as a consequence from our Theorem 2.

The difficulty in dealing with the integral $\psi_w(S)$ is that it is not always finite.
In order to have a finite integral let us consider the following:


FuifS, S ) = I e ^*1“'’ ( 3 ,^( 7 ), a) II drj da, 

'‘a 

where $[\theta_{ij}]$ is a positive definite matrix. As an immediate consequence of
Simaika's theorem we have

(17)  $\Gamma_w(\theta, S) \le \Gamma_{w_0}(\theta, S)$

for any region $w$ satisfying (a) and (b). Now the question arises whether
(17) remains true if the condition (b) on $w$ is removed. The following theorem
answers this question in the negative.

Theorem 3. Let $[\theta_{ij}]$ be a positive definite matrix, $[\rho_{ij}] = [c_{ij} + \theta_{ij}]^{-1}$ and
$\lambda_1, \cdots, \lambda_p$ be the roots of the equation $|c_{ij} - \lambda\theta_{ij}| = 0$. There is a function
$g = g(\lambda_1, \cdots, \lambda_p)$ such that the region

$w_1$: $\rho_{ij} y_i y_j > g(\lambda_1, \cdots, \lambda_p)$

satisfies (a) and has the maximum $\Gamma_w(\theta, S)$.





Proof. From (10) and (14) we obtain 
r„(e, S) = Kj^di, f \ c.j - y{i/j di/ dc 

kioO •'w 

*» A 

Comparing tiro inner integral with (15) and using (16) we got 

rue, S) = Eg, [ 1 c., + da 1 c*. - y<y, (p., y,y^ n dydc 
(18) = g g, 1 1 ca + 0r, H dc 

■f (1 -X. (y. 7 a;. *,)'= n dx, 


where $\gamma_{ij} x_i x_j$ is the result of applying the transformation (9) on $\rho_{ij} y_i y_j$. We
shall show that, for every fixed set of $c_{ij}$, a unique number $g = g(\lambda_1, \cdots, \lambda_p)$
exists such that the region $\rho_{ij} y_i y_j = \gamma_{ij} x_i x_j > g$ satisfies (12), i.e.

(19) f (1 - a;»a;.)*^"~'’”'^nda; = Bt. 

Since $[\gamma_{ij}] = T'[c_{ij} + \theta_{ij}]^{-1}T$, the latent roots of $[\gamma_{ij}]$ are $\lambda_i/(1 + \lambda_i)$
$(i = 1, \cdots, p)$. Hence by a rotation the equation (19) is reduced to


( 20 ) 


/, 


(X,/i+Xi)«i£ja0 


(1 - n dS = Be. 


As $g$ increases from 0 onwards, the left member of (20) decreases steadily from
$B$ to 0. Hence there is a unique $g = g(\lambda_1, \cdots, \lambda_p)$ which satisfies (20).

For this $g(\lambda_1, \cdots, \lambda_p)$ the region $w_1$ satisfies (a). Hence, applying the
Neyman-Pearson Lemma on (18) we obtain the result.

From Theorem 3 we learn that there actually exist other exact tests for $H''$
which have some optimum property not possessed by $T^2$, viz., the tests based
on the critical regions $w_1$ corresponding to various values of the $\theta_{ij}$. However,
the great difficulty in numerical computation prohibits their application and the
$T^2$-test stands out as the only test which is both simple and good.


REFERENCES

[1] P. L. Hsu, "On the distribution of roots of certain determinantal equations," Annals of
Eugenics, Vol. 9 (1939), pp. 250-258.

[2] P. L. Hsu, "An algebraic derivation of the distribution of rectangular coordinates,"
Proc. Edin. Math. Soc., Vol. 6 (1940), pp. 185-189.

[3] P. L. Hsu, "Analysis of variance from the power function standpoint," Biometrika, Vol.
32 (1941), pp. 62-69.

[4] P. L. Hsu, "Canonical reduction of the general regression problem," Annals of Eugenics,
Vol. 11 (1941), pp. 42-46.

[5] J. Neyman and E. S. Pearson, "Contribution to the theory of testing statistical
hypotheses," Stat. Res. Mem., Vol. 1 (1936), pp. 1-37.

[6] J. B. Simaika, "On an optimum property of two important statistical tests," Biometrika,
Vol. 32 (1941), pp. 70-80.

[7] A. Wald, "On the power function of the analysis of variance test," Annals of Math.
Stat., Vol. 13 (1942), pp. 434-439.



SOME GENERALIZATIONS OF THE THEORY OF CUMULATIVE SUMS 

OF RANDOM VARIABLES 

By Abraham Wald 
Columbia University 

1. Introduction. In a previous paper [1] the author dealt with the following
problem: Let $\{z_i\}$ ($i = 1, 2, \ldots$, ad inf.) be a sequence of independently
distributed random variables each having the same distribution. Let a be a given
positive constant, b a given negative constant and denote by n the smallest
positive integer for which either

(1)   $z_1 + \cdots + z_n \geq a$

or

(2)   $z_1 + \cdots + z_n \leq b$

holds. The main problems treated in [1] were: (1) Derivation of the probability
that the cumulative sum reaches the boundary a before the boundary b is reached;
(2) Derivation of the characteristic function and the distribution function of n.

In this paper we shall consider the following more general problem: Let
$K = \{k_i(z_1, \ldots, z_i)\}$ ($i = 1, 2, \ldots$, ad inf.) be a given sequence of functions and let
n be the smallest positive integer for which either

(3)   $k_n(z_1, \ldots, z_n) \geq 1$

or

(4)   $k_n(z_1, \ldots, z_n) \leq -1$

holds. No restrictions are imposed on the sequence K except that it must be
such that the probability that $n < \infty$ is equal to one. The purpose of this
paper is to derive some theorems concerning the probability that
$k_n(z_1, \ldots, z_n) \geq 1$ and concerning the expected value of n. Obviously, the problem
formulated here is a generalization of that considered in [1], since the latter can be
obtained by putting

$k_i(z_1, \ldots, z_i) = \frac{2}{a-b}(z_1 + \cdots + z_i) - \frac{a+b}{a-b}$.
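The stopping rule defined by (3) and (4), and its reduction to the cumulative-sum case, can be sketched in a few lines of Python. This is only an illustration: the names `stopping_index` and `linear_k` are hypothetical, and a finite list of observed values stands in for the infinite sequence.

```python
from typing import Callable, Optional, Sequence


def stopping_index(zs: Sequence[float],
                   k: Callable[[Sequence[float]], float]) -> Optional[int]:
    """Smallest n with k(z_1, ..., z_n) >= 1 or <= -1, or None if no
    boundary is reached within the supplied values."""
    for n in range(1, len(zs) + 1):
        value = k(zs[:n])
        if value >= 1 or value <= -1:
            return n
    return None


def linear_k(a: float, b: float) -> Callable[[Sequence[float]], float]:
    """k_i = 2/(a-b) * (z_1 + ... + z_i) - (a+b)/(a-b): equals 1 when the
    cumulative sum is a and -1 when it is b (requires a > 0 > b)."""
    def k(prefix: Sequence[float]) -> float:
        return 2.0 / (a - b) * sum(prefix) - (a + b) / (a - b)
    return k


# Cumulative sums 1, -1, 2, 5 first leave the interval (-3, 4) at n = 4.
n_stop = stopping_index([1, -2, 3, 3, -1], linear_k(4, -3))
```

Any other sequence of functions can be passed in place of `linear_k`, which is the point of the generalization.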

2. The conjugate distribution of z. Let z be a random variable whose
distribution is equal to the common distribution of the $z_i$. In this section we shall
introduce the notion of the conjugate distribution of z which will be used later.
According to Lemma 2 in [1], under some weak restrictions on the distribution
of z there exists exactly one real value $h_0 \neq 0$ such that

(5)   $E(e^{z h_0}) = 1$

where E(u) denotes the expected value of u for any random variable u.



For simplicity we shall assume that z has a continuous distribution admitting
a probability density everywhere, or that z has a discrete distribution. By the
probability distribution f(z) of z we shall mean the probability density of z if
the distribution of z is continuous. In the discrete case f(z) will denote the
probability that the random variable takes the value z. From (5) it follows that

(6)   $f^*(z) = e^{z h_0} f(z)$

is a probability distribution. We shall call $f^*(z)$ the conjugate distribution of z.
For any random variable u we shall denote by $E^*(u)$ the expected value of u
under the assumption that the distribution of z is given by $f^*(z)$. The expected
values $E(u)$ and $E^*(u)$ may depend on the sequence $K = \{k_i(z_1, \ldots, z_i)\}$
($i = 1, 2, \ldots$, ad inf.). Occasionally we shall put this dependence in evidence
by writing $E(u \mid K)$ and $E^*(u \mid K)$, respectively.
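For a discrete distribution the constant $h_0$ of (5) and the conjugate distribution (6) can be found numerically. The sketch below is an illustration, not part of the paper: it assumes that z takes both positive and negative values, that $Ez \neq 0$, and that $|h_0|$ is larger than the small starting value used for the bracket. The function names are hypothetical.

```python
import math
from typing import Dict


def mgf_minus_one(pmf: Dict[float, float], h: float) -> float:
    """E(e^{hz}) - 1 for a discrete distribution {value: probability}."""
    return sum(p * math.exp(h * z) for z, p in pmf.items()) - 1.0


def find_h0(pmf: Dict[float, float], tol: float = 1e-12) -> float:
    """Nonzero root h0 of E(e^{hz}) = 1, by bracketing and bisection."""
    mean = sum(p * z for z, p in pmf.items())
    if mean == 0 or not (min(pmf) < 0 < max(pmf)):
        raise ValueError("need E(z) != 0 and values of both signs")
    sign = 1.0 if mean < 0 else -1.0        # h0 has the opposite sign to E(z)
    inner, outer = sign * 1e-6, sign * 1.0  # inner is assumed to lie between 0 and h0
    while mgf_minus_one(pmf, outer) <= 0:   # widen until the function is positive
        outer *= 2.0
    while abs(outer - inner) > tol:
        mid = 0.5 * (inner + outer)
        if mgf_minus_one(pmf, mid) > 0:
            outer = mid
        else:
            inner = mid
    return 0.5 * (inner + outer)


def conjugate(pmf: Dict[float, float]) -> Dict[float, float]:
    """f*(z) = e^{h0 z} f(z), equation (6)."""
    h0 = find_h0(pmf)
    return {z: p * math.exp(h0 * z) for z, p in pmf.items()}


# Steps +1 and -1 with probabilities 0.4 and 0.6: h0 = log(0.6 / 0.4),
# and the conjugate distribution interchanges the two probabilities.
example = conjugate({1.0: 0.4, -1.0: 0.6})
```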


3. Two theorems. In this section we shall derive two theorems. The first
theorem is concerned with the probability that $k_n(z_1, \ldots, z_n) \geq 1$ and the
second theorem with the expected value of n. In what follows the operator $E_1$
will mean conditional expected value under the restriction that $k_n(z_1, \ldots, z_n) \geq 1$
and $E_2$ will mean conditional expected value under the restriction that
$k_n(z_1, \ldots, z_n) \leq -1$. If the distribution of z is given by $f^*(z)$, these conditional
expected values will be denoted by the operators $E_1^*$ and $E_2^*$, respectively.

Theorem 1. Let $K = \{k_i(z_1, \ldots, z_i)\}$ be a sequence such that the probability
that $n < \infty$ is equal to one under both distributions $f(z)$ and $f^*(z)$. Let $\gamma$ denote
the probability that $k_n(z_1, \ldots, z_n) \geq 1$ when $f(z)$ is the distribution of z, and let
$\gamma^*$ denote the probability of the same event when $f^*(z)$ is the distribution of z. Then

(7)   $E_1(e^{Z_n h_0}) = \frac{\gamma^*}{\gamma}$,   $E_2(e^{Z_n h_0}) = \frac{1 - \gamma^*}{1 - \gamma}$

and

(8)   $E_1^*(e^{-Z_n h_0}) = \frac{\gamma}{\gamma^*}$,   $E_2^*(e^{-Z_n h_0}) = \frac{1 - \gamma}{1 - \gamma^*}$

where $Z_n = z_1 + \cdots + z_n$.

Proof: From (6) it follows that

(9)   $e^{Z_n h_0} = \frac{f^*(z_1) \cdots f^*(z_n)}{f(z_1) \cdots f(z_n)}$

and

(10)   $e^{-Z_n h_0} = \frac{f(z_1) \cdots f(z_n)}{f^*(z_1) \cdots f^*(z_n)}$.

A set $(z_1, \ldots, z_n)$ will be said to be of type 1 if and only if
$-1 < k_m(z_1, \ldots, z_m) < 1$ for $m = 1, \ldots, n - 1$ and $k_n(z_1, \ldots, z_n) \geq 1$.
Similarly a set $(z_1, \ldots, z_n)$ will be said to be of type 2 if and only if
$-1 < k_m(z_1, \ldots, z_m) < 1$ for $m = 1, \ldots, n - 1$ and $k_n(z_1, \ldots, z_n) \leq -1$.





We shall prove Theorem 1 under the assumption that the distribution of z is
discrete. Because of (9) we have

(11)   $E_1(e^{Z_n h_0}) = E_1\left[\frac{f^*(z_1) \cdots f^*(z_n)}{f(z_1) \cdots f(z_n)}\right] = \frac{\sum_{(z_1, \ldots, z_n)} f^*(z_1) \cdots f^*(z_n)}{\sum_{(z_1, \ldots, z_n)} f(z_1) \cdots f(z_n)}$

where the summation is to be taken over all sets $(z_1, \ldots, z_n)$ of type 1. But
the last expression is obviously equal to $\gamma^*/\gamma$ and, therefore, the first equation in
(7) is proved. The second equation in (7) follows in the same manner if we take
into account the fact that the probability that $n < \infty$ is equal to one. Similarly,
equation (8) can be obtained from (10). The proof can easily be extended to
the case when the distribution of z is continuous. Hence, Theorem 1 is proved.
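Theorem 1 can be checked numerically in the simplest case, a walk with steps +1 and -1 stopped at integer boundaries a > 0 > b. There the stopped sum $Z_n$ equals a or b exactly, so (7) reduces to $e^{a h_0} = \gamma^*/\gamma$ and $e^{b h_0} = (1-\gamma^*)/(1-\gamma)$. The sketch below is an illustration under those assumptions; `upper_hit_probability` is a hypothetical helper that propagates the probability mass step by step and neglects the mass still unabsorbed after `steps` steps.

```python
import math
from typing import Dict


def upper_hit_probability(p_up: float, a: int, b: int, steps: int = 2000) -> float:
    """Probability that a +1/-1 walk started at 0 reaches a before b."""
    mass: Dict[int, float] = {0: 1.0}   # probability of each interior position
    gamma = 0.0
    for _ in range(steps):
        nxt: Dict[int, float] = {}
        for pos, pr in mass.items():
            for step, q in ((1, p_up), (-1, 1.0 - p_up)):
                new = pos + step
                if new >= a:
                    gamma += pr * q      # absorbed at the upper boundary
                elif new > b:
                    nxt[new] = nxt.get(new, 0.0) + pr * q
                # new <= b: absorbed at the lower boundary, mass is dropped
        mass = nxt
    return gamma


p, a, b = 0.4, 3, -2
h0 = math.log((1 - p) / p)                         # root of (5) for this walk
gamma = upper_hit_probability(p, a, b)
gamma_star = upper_hit_probability(1 - p, a, b)    # conjugate walk swaps p and 1 - p
lhs_first, rhs_first = math.exp(a * h0), gamma_star / gamma
lhs_second, rhs_second = math.exp(b * h0), (1 - gamma_star) / (1 - gamma)
```

Each `lhs` should agree with the corresponding `rhs` up to rounding error.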
Theorem 2. If $Ez \neq 0$, the relation

(12)   $E(n \mid K) = \frac{E(Z_n \mid K)}{Ez}$

holds for any sequence $K = \{k_i(z_1, \ldots, z_i)\}$ for which one of the following two
conditions is fulfilled:

(a) There exists an integer N such that the probability that $n \leq N$ is equal to one.

(b) $E(n \mid K) < \infty$ and the first four moments of z are finite.

Proof: First we shall show that condition (a) implies the validity of (12).
For any integer i we shall denote $z_1 + \cdots + z_i$ by $Z_i$. Since the probability
that $n \leq N$ is equal to 1, we have

(13)   $E(Z_n \mid K) + E(z_{n+1} + \cdots + z_N) = E Z_N = N Ez$.

Since the conditional expected value of $(z_{n+1} + \cdots + z_N)$ for a given value of
n is equal to $(N - n) Ez$, we have

(14)   $E(z_{n+1} + \cdots + z_N) = E(N - n \mid K) Ez = N Ez - E(n \mid K) Ez$.

Equation (12) follows from (13) and (14).

Now we shall show that condition (b) implies (12). Denote by $P_N$ the
probability that $n \leq N$. Let the operator $E_N$ denote conditional expected value
under the restriction that $n \leq N$, and let the operator $E'_N$ denote conditional
expected value under the restriction that $n > N$. Then we have

(15)   $P_N E_N(Z_N) + (1 - P_N) E'_N(Z_N) = E(Z_N) = N Ez$.

Since

(16)   $E_N(Z_N) = E_N(Z_n \mid K) + E_N(z_{n+1} + \cdots + z_N \mid K) = E_N(Z_n \mid K) + E_N(N - n \mid K) Ez = E_N(Z_n \mid K) + N Ez - E_N(n \mid K) Ez$,




we obtain from (15)

(17)   $P_N[E_N(Z_n \mid K) + N Ez - E_N(n \mid K) Ez] + (1 - P_N) E'_N(Z_N) = N Ez$.

From $E(n \mid K) < \infty$ it follows that

(18)   $\lim_{N \to \infty} (1 - P_N) N = 0$.

Now we shall show that (18) implies the validity of

(19)   $\lim_{N \to \infty} (1 - P_N) E'_N(Z_N) = 0$.

Let $T_N = Z_N - N Ez$. Because of (18), (19) is proved if we can show that

(20)   $\lim_{N \to \infty} (1 - P_N) E'_N(T_N) = 0$.

Denote by $R_N$ the set of all points $(z_1, \ldots, z_N)$ for which $n > N$. Then the
probability measure of $R_N$ is equal to $1 - P_N$ and

(21)   $(1 - P_N) E'_N(T_N) = \int_{R_N} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N$.

Let $R_N^{(1)}$ be the part of $R_N$ in which $T_N < -N$, $R_N^{(2)}$ the part of $R_N$ in which
$T_N > N$ and $R_N^{(3)}$ the part of $R_N$ in which $-N \leq T_N \leq N$. Because of (18) we have

(22)   $\lim_{N \to \infty} \left| \int_{R_N^{(3)}} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N \right| \leq \lim_{N \to \infty} (1 - P_N) N = 0$.

Denote the cumulative distribution function of $T_N$ by $F_N(T_N)$. Clearly,

(23)   $\int_{R_N^{(2)}} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N \leq \int_N^{\infty} T_N\, dF_N(T_N) \leq \frac{1}{N^3} \int_N^{\infty} T_N^4\, dF_N(T_N)$.

Since the first four moments of z are finite, the 4-th moment of $T_N/\sqrt{N}$ converges
to $3\sigma^4$ where $\sigma$ is the standard deviation of z. Hence

(24)   $\lim_{N \to \infty} \int_{-\infty}^{\infty} \frac{1}{N^2} T_N^4\, dF_N(T_N) = 3\sigma^4$.

From (23) and (24) it follows that

(25)   $\lim_{N \to \infty} \int_{R_N^{(2)}} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N = 0$.

Similarly we can prove that

(26)   $\lim_{N \to \infty} \int_{R_N^{(1)}} T_N f(z_1) \cdots f(z_N)\, dz_1 \cdots dz_N = 0$.

Equation (20) follows from (21), (22), (25) and (26). Hence (19) is proved.
From (17), (18) and (19) we obtain

(27)   $\lim_{N \to \infty} P_N\{E_N(Z_n \mid K) - E_N(n \mid K) Ez\} = 0$.



Since $Ez \neq 0$, $\lim P_N = 1$, $\lim E_N(n \mid K) = E(n \mid K)$ and $\lim E_N(Z_n \mid K) = E(Z_n \mid K)$,
equation (12) follows from (27). Hence condition (b) implies (12)
and Theorem 2 is proved.
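Relation (12) is easy to test numerically for a stopped cumulative sum with integer-valued steps. The sketch below is an illustration only: `stopped_moments` is a hypothetical helper that propagates the probability mass over the interior positions and accumulates $E(n)$ and $E(Z_n)$ as mass is absorbed; the distribution and boundaries are arbitrary choices, and the mass left after `steps` steps is neglected.

```python
from typing import Dict, Tuple


def stopped_moments(pmf: Dict[int, float], a: int, b: int,
                    steps: int = 5000) -> Tuple[float, float]:
    """Return (E n, E Z_n) for Z_i = z_1 + ... + z_i stopped at the first i
    with Z_i >= a or Z_i <= b (a > 0 > b, integer-valued steps)."""
    mass: Dict[int, float] = {0: 1.0}
    e_n = 0.0
    e_zn = 0.0
    for n in range(1, steps + 1):
        nxt: Dict[int, float] = {}
        for pos, pr in mass.items():
            for z, q in pmf.items():
                new = pos + z
                if new >= a or new <= b:      # boundary reached at step n
                    e_n += n * pr * q
                    e_zn += new * pr * q
                else:
                    nxt[new] = nxt.get(new, 0.0) + pr * q
        mass = nxt
    return e_n, e_zn


pmf = {2: 0.3, -1: 0.7}                        # E z = -0.1; overshoot is possible
mean_z = sum(z * q for z, q in pmf.items())
e_n, e_zn = stopped_moments(pmf, a=3, b=-3)
gap = e_zn - e_n * mean_z                      # (12) says this vanishes
```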

4. Lower limit of $E(n \mid K)$. In this section we shall derive a lower limit for
$E(n \mid K)$. First we shall prove the following lemma.

Lemma 1. For any random variable u we have

(28)   $e^{E(u)} \leq E e^{u}$.

Proof. Inequality (28) can be written as

(29)   $1 \leq E e^{u'}$

where $u' = u - Eu$. Lemma 1 is proved if we show that (29) holds for any
random variable u' whose mean is zero. Expanding $e^{u'}$ in a Taylor series
around $u' = 0$, we obtain

$e^{u'} = 1 + u' + \frac{u'^2}{2} e^{\xi(u')}$,   where $\xi(u')$ lies between 0 and u'.

Hence

$E e^{u'} = 1 + \frac{1}{2} E[u'^2 e^{\xi(u')}] \geq 1$

and Lemma 1 is proved.

Now we are able to prove the following theorem.

Theorem 3. Let $K = \{k_i(z_1, \ldots, z_i)\}$ be a sequence of functions such that
the probability that $n < \infty$ is one under both distributions $f(z)$ and $f^*(z)$ of z. Let
$\gamma$ be the probability that $k_n(z_1, \ldots, z_n) \geq 1$ when $f(z)$ is the distribution of z,
and let $\gamma^*$ be the probability of the same event when $f^*(z)$ is the distribution of z.
Then

(30)   $E(n \mid K) \geq \frac{1}{h_0 Ez}\left[\gamma \log\frac{\gamma^*}{\gamma} + (1 - \gamma)\log\frac{1 - \gamma^*}{1 - \gamma}\right]$

and

(31)   $E^*(n \mid K) \geq \frac{1}{h_0 E^* z}\left[\gamma^* \log\frac{\gamma^*}{\gamma} + (1 - \gamma^*)\log\frac{1 - \gamma^*}{1 - \gamma}\right]$

provided that $Ez$ and $E^* z$ are not equal to zero.

Proof: First we shall prove Theorem 3 in the case when there exists an integer
N such that the probability that $n \leq N$ is one. According to Theorem 2 we have

(32)   $E(n \mid K) = \frac{1}{Ez}[\gamma E_1(Z_n \mid K) + (1 - \gamma) E_2(Z_n \mid K)]$.

From Lemma 1 and Theorem 1 it follows that

(33)   $h_0 E_1(Z_n \mid K) \leq \log\frac{\gamma^*}{\gamma}$   and   $h_0 E_2(Z_n \mid K) \leq \log\frac{1 - \gamma^*}{1 - \gamma}$.





From (32) and (33) we obtain

(34)   $h_0 Ez\, E(n \mid K) = h_0[\gamma E_1(Z_n \mid K) + (1 - \gamma) E_2(Z_n \mid K)] \leq \gamma \log\frac{\gamma^*}{\gamma} + (1 - \gamma)\log\frac{1 - \gamma^*}{1 - \gamma}$.

Inequality (30) follows from (34) if we can show that $h_0 E(z) < 0$. From
$E e^{h_0 z} = 1$ and Lemma 1 it follows that $h_0 E(z) \leq 0$. Since $h_0 \neq 0$ and $E(z) \neq 0$, we must
have $h_0 E(z) < 0$. Hence (30) is proved. To prove (31) we proceed as follows:
From Theorem 2 we obtain

(35)   $-h_0 E^* z\, E^*(n \mid K) = -h_0[\gamma^* E_1^*(Z_n \mid K) + (1 - \gamma^*) E_2^*(Z_n \mid K)]$.

From Lemma 1 and Theorem 1 it follows that

(36)   $-h_0[\gamma^* E_1^*(Z_n \mid K) + (1 - \gamma^*) E_2^*(Z_n \mid K)] \leq \gamma^* \log\frac{\gamma}{\gamma^*} + (1 - \gamma^*)\log\frac{1 - \gamma}{1 - \gamma^*}$.

From (35) and (36) we obtain

(37)   $h_0 E^*(z) E^*(n \mid K) \geq \gamma^* \log\frac{\gamma^*}{\gamma} + (1 - \gamma^*)\log\frac{1 - \gamma^*}{1 - \gamma}$.

Since $E^* e^{-h_0 z} = 1$ it follows from Lemma 1 that $-h_0 E^* z < 0$. Inequality (31)
follows from this and (37). Hence Theorem 3 is proved in the special case when
there exists an integer N such that the probability that $n \leq N$ is equal to one.

To prove Theorem 3 in the general case, for any integer N let the sequence
$K_N = \{k_{iN}(z_1, \ldots, z_i)\}$ be defined as follows: $k_{iN}(z_1, \ldots, z_i) = k_i(z_1, \ldots, z_i)$
for $i < N$ and $k_{iN}(z_1, \ldots, z_i) = 1$ for $i \geq N$. Denote by $\gamma_N$ and $\gamma_N^*$ the values
of $\gamma$ and $\gamma^*$, respectively, if the sequence K is replaced by $K_N$. Then we have

(38)   $E(n \mid K) \geq E(n \mid K_N) \geq \frac{1}{h_0 Ez}\left[\gamma_N \log\frac{\gamma_N^*}{\gamma_N} + (1 - \gamma_N)\log\frac{1 - \gamma_N^*}{1 - \gamma_N}\right]$

and

(39)   $E^*(n \mid K) \geq E^*(n \mid K_N) \geq \frac{1}{h_0 E^* z}\left[\gamma_N^* \log\frac{\gamma_N^*}{\gamma_N} + (1 - \gamma_N^*)\log\frac{1 - \gamma_N^*}{1 - \gamma_N}\right]$.

Since $\lim_{N \to \infty} \gamma_N = \gamma$ and $\lim_{N \to \infty} \gamma_N^* = \gamma^*$, inequalities (30) and (31) follow from (38)
and (39). Hence the proof of Theorem 3 is completed.
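The right-hand side of (30) is simple to evaluate once $\gamma$, $\gamma^*$, $h_0$ and $Ez$ are known. The sketch below is an illustration: it uses a walk with steps +1 and -1, for which the classical gambler's-ruin formula (a standard result, not taken from this paper) gives $\gamma$ in closed form. Because such a walk stops exactly on a boundary, the bound is attained: it coincides with $E(n) = [\gamma a + (1-\gamma) b]/Ez$ from (12). Function names are hypothetical.

```python
import math


def lower_bound_expected_n(gamma: float, gamma_star: float,
                           h0: float, mean_z: float) -> float:
    """Right-hand side of (30)."""
    bracket = (gamma * math.log(gamma_star / gamma)
               + (1.0 - gamma) * math.log((1.0 - gamma_star) / (1.0 - gamma)))
    return bracket / (h0 * mean_z)


def ruin_gamma(p_up: float, a: int, b: int) -> float:
    """Gambler's-ruin probability of reaching a > 0 before b < 0 for a
    +1/-1 walk started at 0 with P(step = +1) = p_up != 1/2."""
    r = (1.0 - p_up) / p_up
    return (1.0 - r ** (-b)) / (1.0 - r ** (a - b))


p, a, b = 0.4, 3, -2
h0 = math.log((1 - p) / p)            # root of (5) for this walk
mean_z = 2 * p - 1
gamma = ruin_gamma(p, a, b)
gamma_star = ruin_gamma(1 - p, a, b)  # the conjugate walk swaps p and 1 - p
bound = lower_bound_expected_n(gamma, gamma_star, h0, mean_z)
exact_n = (gamma * a + (1 - gamma) * b) / mean_z
```

For step distributions that can overshoot the boundaries, `bound` is in general strictly smaller than the expected number of steps.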

5. Remarks added in proof. The results obtained in the present paper have
obvious applications to sequential analysis. Those applications are, however,
not mentioned here, because at the time the present paper was submitted for
publication, sequential analysis constituted classified material. In the meantime,
the material on sequential analysis has been released and was published in
this Journal, June, 1945. The results obtained in the present paper are more
general than those obtained in connection with sequential analysis. Theorem 3,
in the present paper, implied the efficiency of the sequential probability ratio
test discussed in Section 4.7 of the paper on sequential tests.

REFERENCE

[1] A. Wald, "On cumulative sums of random variables," Annals of Math. Stat., Vol. 15
(1944), pp. 283-296.



ON THE DESIGN OF EXPERIMENTS FOR WEIGHING AND MAKING 
OTHER TYPES OF MEASUREMENTS 

By K. Kishen

Department of Agriculture, Lucknow, India

1. Introduction. In a recent paper, Hotelling [1] has discussed the basic
principles of the theory of the design of efficient experiments for estimating the
true unknown weights of p given objects by means of a specified number N of
weighings, $p \leq N$ in case the scale is free from bias and $p \leq N - 1$ if it has a
bias the unknown value of which has to be estimated from the same data. He
has emphasized the importance of these designs in other kinds of measurements
besides weighing of objects and has called attention to the need for further
mathematical research for obtaining a "comprehensive general solution." Such
a solution has now been obtained in case the number of weighings N is at our
choice. Some other general designs have also been given in this paper for
specified values of N and p.

2. Estimation of unknown weights and efficiency of a design. Using
Hotelling's notation, we may write

(1)   $E(y_\alpha) = \sum_{i=1}^{p} x_{i\alpha} b_i$,

where $i = 1, 2, \ldots, p$, on the assumption that there is either zero bias in the
scale or the bias is known a priori, and $\alpha = 1, 2, \ldots, N$. $E(y_\alpha)$ is the
expectation of the $\alpha$th weighing. For a biassed scale, we may take $i = 0, 1, 2, \ldots, p$.
The efficient estimate of each of the $b_i$'s has been derived by Hotelling by the
method of least squares. It is of interest to obtain these estimates by the use
of the theory of linear estimation as developed by Bose [2] and Rao [3].

Assuming that $y_1, y_2, \ldots, y_N$ are N stochastic variates forming a
multivariate normal system with the variance and covariance matrix given by

(2)   $u = [u_{ij}]$,

it follows from Rao's generalization of Markoff's theorem that the best unbiassed
estimates of the $b_i$'s are given by the solutions of the normal equations

(3)   $X' u^{-1} X B' = X' u^{-1} Y'$,

where $B = [b_1 b_2 \cdots b_p]$ and $Y = [y_1 y_2 \cdots y_N]$, and B' and Y' denote as usual
the transpose of the row vectors B and Y, i.e. column vectors.

In the present case, the assumption is that all the N stochastic variates are
uncorrelated and have a common variance $\sigma^2$, so that

(4)   $u = \sigma^2 I$.




Hence the normal equations in (3) reduce to

(5)   $X'XB' = X'Y'$,

which are exactly the same as the normal equations given by Hotelling, since

(6)   $X'X = [a_{ij}]$

where $a_{ij} = S(x_{i\alpha} x_{j\alpha})$.

Let $C = [c_{ij}]$ denote the reciprocal of the matrix X'X, so that $V(b_i) = c_{ii}\sigma^2$
and cov $(b_i b_j) = c_{ij}\sigma^2$. Then the mean variance of the p unknowns for a design
is given by

(7)   $\frac{\sigma^2}{N} \cdot \frac{N \sum_{i=1}^{p} c_{ii}}{p}$.

If the main object of the experiment is to estimate the unknowns with the
least variance, the most efficient design (for a specified value of N) would be
the one for which the minimum minimorum of $\sigma^2/N$ is attained for all the p
unknowns so that the mean variance in this case is $\sigma^2/N$. The factor
$N \sum_{i=1}^{p} c_{ii}/p$ on the right-hand side of (7), therefore, measures the increase in
variance resulting from the adoption of any design other than the most efficient design. Its
reciprocal, $p / (N \sum_{i=1}^{p} c_{ii})$, may appropriately be defined as the efficiency of a given
design for providing estimates of the p unknowns. This quantity will now be
utilized for judging the relative precision of the general designs discussed in the
subsequent paragraphs.
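The efficiency just defined can be computed directly from a design matrix. The following sketch is an illustration in exact rational arithmetic; the function names are hypothetical, X is passed as a list of rows (one row per weighing), and in the biassed case the first column is assumed to be the bias column and is left out of the sum.

```python
from fractions import Fraction
from typing import List

Matrix = List[List[Fraction]]


def gram(x: List[List[int]]) -> Matrix:
    """X'X for a design matrix X given as a list of rows."""
    cols = len(x[0])
    return [[Fraction(sum(row[i] * row[j] for row in x)) for j in range(cols)]
            for i in range(cols)]


def inverse(a: Matrix) -> Matrix:
    """Gauss-Jordan inverse; raises StopIteration if the matrix is singular."""
    n = len(a)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def efficiency(x: List[List[int]], bias: bool = False) -> Fraction:
    """p / (N * sum of c_ii over the p unknown weights)."""
    n_weighings = len(x)
    c = inverse(gram(x))
    start = 1 if bias else 0          # skip the assumed bias column
    diag = [c[i][i] for i in range(start, len(c))]
    return Fraction(len(diag)) / (n_weighings * sum(diag))
```

For example, `efficiency([[1, 1], [1, -1]])` is 1, while weighing two objects separately, `efficiency([[1, 0], [0, 1]])`, gives 1/2.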


3. Design for $N = 2^m$, $p \leq 2^m$ (zero bias) or $p \leq 2^m - 1$ (non-zero bias).
By utilizing the properties of a 2-sided m-fold completely orthogonalized
Hyper-Graeco-Latin hyper-cube of the first order introduced by the author [4], it is
easy to see that for $N = 2^m$, $p \leq 2^m$ (when there is zero bias) or $p \leq 2^m - 1$
(when there is bias), m being any positive integer, a completely orthogonalized
design can be constructed with each unknown weight estimated with the
minimum variance $\sigma^2/N$. As remarked by Hotelling in the case of N = 4, p = 4
(for zero bias) or p = 3 (if there is bias), the matrix X'X for this design is a
scalar matrix of order $p \times p$ if there is zero bias, or of order $(p + 1) \times (p + 1)$
if there is bias, each of the diagonal elements being N. The reciprocal matrix
is also a scalar matrix in which each of the diagonal elements is 1/N so that the
estimates of all the unknowns are mutually orthogonal.

As a particular case of this general design, we may take N = 16, p = 16 (for
zero bias) or p = 15 (if there is bias), the completely orthogonalized design for
which is represented by the matrix



296 K. KIHHEN 

(8) X = 


f 

1 

1 

I 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

1 

-1 

1 

1 

1 

-1 

-1 

-1 

1 

1 

1 

-1 

--1 

-1 

1 

-1 

1 

1 

-1 

1 

I 

-1 

1 

1 

-1 

-1 

1 

-1 

-1 

1 

~1 

-1 

1 

-1 

-1 

1 

1 

1 

-1 

-1 

-1 

-1 

1 

1 

1 

-1 

-1 

1 

1 

1 

1 

-1 

1 

1 

-1 

1 

-1 

1 

~1 

-1 

1 

-1 

-1 

-1 

1 

-1 

1 

-1 

1 

-1 

1 

-1 

-1 

L 

-] 

1 

-1 

1 


1 

1 

1 

-1 

-1 

1 

-1 

-1 

1 

1 

-1 

-1 

1 

-1 

-1 

1 

1 

1 

-1 

-1 

-1 

1 

1 

1 

-1 

1 

-1 

-1 

-1 

1 

1 

1 

-1 

1 

1 

1 

1 

-1 

I 

1 

-1 

1 

-1 

-1 

1 

-1 

-1 

-1 

-1 

1 

-1 

1 

1 

-1 

-1 

-1 

1 

1 

-1 

-1 

-1 

1 

1 

-1 

1 

1 

1 

-1 

1 

-1 

-1 

1 

-1 

-1 

1 

-1 

-1 

1 

-1 

1 

1 

1 

-1 

-1 

1 

-1 

1 

-1 

1 

-1 

1 

-1 

] 

-1 

1 

1 

-1 

1 

1 

1 

-1 

-1 

1 

-1 

-1 

-1 

-1 

1 

-1 

-1 

1 

1 

1 

1 

-1 

L 

-1 

-1 

-1 

1 

1 

-1 

-1 

1 

1 

1 

-1 

1 

-1 

1 

1 

-1 

-1 

-1 

-1 

-1 

-1 

1 

1 

1 

1 

1 

1 

-1 

-1 

ll 


-1 

-1 

-1 

1 

1 

1 

1 

1 

1 

~1 

-1 

-1 

-1 

1 


for which X'X is a scalar matrix of order $16 \times 16$, each diagonal element being
16. Again, a completely orthogonalized design for N = 16, p < 16 (for zero
bias) or p < 15 (if there is bias) is represented by a matrix X obtained from the
matrix in (8) by omitting any 16 - p of its columns if there is zero bias, or
16 - p - 1 of its columns if there is bias. In the matrix X, permutation of
rows and columns is permissible and each such matrix represents a completely
orthogonalized design.
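The orthogonality property, and its survival when columns are omitted, can be checked mechanically. The sketch below is an illustration that builds a $2^m \times 2^m$ matrix of +1 and -1 entries by Sylvester's doubling rule. That construction is an assumption of the sketch, substituted for the paper's derivation from hyper-Graeco-Latin hyper-cubes, and it need not reproduce the matrix in (8) entry for entry.

```python
from typing import List


def sylvester(m: int) -> List[List[int]]:
    """2^m x 2^m matrix with entries +1/-1 whose columns are mutually orthogonal."""
    h = [[1]]
    for _ in range(m):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def gram(x: List[List[int]]) -> List[List[int]]:
    """X'X for a design matrix X given as a list of rows."""
    cols = len(x[0])
    return [[sum(r[i] * r[j] for r in x) for j in range(cols)] for i in range(cols)]


def keep_columns(x: List[List[int]], keep: List[int]) -> List[List[int]]:
    """Design obtained by retaining only the listed columns."""
    return [[row[j] for j in keep] for row in x]


def is_scalar(g: List[List[int]], value: int) -> bool:
    """True when g equals value times the identity matrix."""
    return all(g[i][j] == (value if i == j else 0)
               for i in range(len(g)) for j in range(len(g)))


full = sylvester(4)                                   # N = 16, 16 columns
reduced = keep_columns(full, [0, 3, 5, 9, 12])        # any subset of columns
```

Here `is_scalar(gram(full), 16)` and `is_scalar(gram(reduced), 16)` both hold, which is the property stated in the text.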

For the design given by Hotelling¹ for N = 4, p = 3 (zero bias), the efficiency
is 35 per cent. The completely orthogonalized design for which the efficiency
is 100 per cent is represented by the matrix


(9) 



1 

1 

-1 

-1 


1 ] 

-1 

1 

-1 


4. First design for $N = 2^m + 1$, $p \leq 2^m$ (zero bias) or $p \leq 2^m - 1$ (non-zero
bias). For $N = 2^m + 1$, $p \leq 2^m$ (zero bias) or $p \leq 2^m - 1$ (if there is bias),
m being any positive integer, probably the most efficient design available is
that represented by the matrix X obtained from the corresponding matrix

¹ The allusions here and at the end of the next section are to designs on p. 306 of the
Hotelling paper [1], a passage concerned with designs subject to the restriction that the
entries of the matrix be 0's and +1's only, as is necessary in many types of measurement.
The more efficient designs given above, whose matrices involve -1's also, can be used only
in such cases as that of weighing in a balance, where the objects under investigation can be
put, some in one pan and some in the other. Such situations are considered in a different
part of Hotelling's paper.





X for the general design of Section 3 above by adding a row 1, 1, ..., 1 to it.
The matrix X'X for this design then comes out as

(10)   $X'X = \begin{bmatrix} N & 1 & \cdots & 1 \\ 1 & N & \cdots & 1 \\ \vdots & \vdots & & \vdots \\ 1 & 1 & \cdots & N \end{bmatrix}$,

which is a symmetrical matrix of order $p \times p$ if there is zero bias, or of order
$(p + 1) \times (p + 1)$ if there is bias. The variance of each unknown for this
design is

(11)   $\frac{\sigma^2}{N - \frac{p - 1}{N + p - 2}}$   for zero bias,

or

(12)   $\frac{\sigma^2}{N - \frac{p}{N + p - 1}}$   if there is bias.

Thus the efficiency of this design is

(13)   $1 - \frac{p - 1}{N(N + p - 2)}$   for zero bias,

or

(14)   $1 - \frac{p}{N(N + p - 1)}$   if there is bias.

The loss of efficiency resulting from the adoption of this design is, therefore,
$\frac{p - 1}{N(N + p - 2)}$ or $\frac{p}{N(N + p - 1)}$.

As a particular case of this, for N = 5, p = 2 (zero bias), probably the most
efficient design available is specified by


(15) 



The variance of each unknown in this case is $5\sigma^2/24$ and the efficiency of the design
is 96 per cent. For the design given by Hotelling for this case, the variance of
each unknown is $4\sigma^2/7$ and the efficiency is 35 per cent. It would thus appear






that, as judged by the criterion of efficiency as defined here, the design
represented by the matrix in (15) is more efficient than Hotelling's design.
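The efficiency of the first design can be recomputed from its definition. The sketch below is an illustration for the zero-bias case: it builds the design from a Sylvester-type orthogonal matrix (an assumption of the sketch; any completely orthogonalized design would serve), appends the row of ones, checks a closed-form candidate for the reciprocal of X'X, and evaluates $p/(N \sum c_{ii})$. Function names are hypothetical.

```python
from fractions import Fraction
from typing import List


def sylvester(m: int) -> List[List[int]]:
    """2^m x 2^m matrix of +1/-1 entries with mutually orthogonal columns."""
    h = [[1]]
    for _ in range(m):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def first_design(m: int, p: int) -> List[List[int]]:
    """First p columns of the 2^m design with a row 1, 1, ..., 1 appended."""
    return [row[:p] for row in sylvester(m)] + [[1] * p]


def efficiency_first_design(m: int, p: int) -> Fraction:
    """p / (N * sum c_ii) for the zero-bias case, N = 2^m + 1."""
    x = first_design(m, p)
    n = len(x)
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    # Candidate reciprocal of (N-1) I + J:  (I - J / (N + p - 1)) / (N - 1)
    c = [[(Fraction(int(i == j)) - Fraction(1, n + p - 1)) / (n - 1)
          for j in range(p)] for i in range(p)]
    # Confirm that c really is the reciprocal matrix before using it.
    for i in range(p):
        for j in range(p):
            assert sum(xtx[i][k] * c[k][j] for k in range(p)) == int(i == j)
    return Fraction(p) / (n * sum(c[i][i] for i in range(p)))
```

With m = 2 and p = 2 (so N = 5) the function returns 24/25, the 96 per cent quoted in the text, and in general it agrees with $1 - (p-1)/[N(N+p-2)]$.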


5. Second design for $N = 2^m + 1$, $p \leq 2^m$ (zero bias) or $p \leq 2^m - 1$ (non-zero
bias). Another interesting design for these values of N and p is that
represented by the matrix X obtained by adding a row 1, 0, ..., 0 to the
corresponding matrix X for the general design in Section 3 above. The matrix X'X
for this design is then the diagonal matrix

(16)   $X'X = \begin{bmatrix} N & 0 & \cdots & 0 \\ 0 & N - 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & N - 1 \end{bmatrix}$

of order $p \times p$ (for zero bias) or $(p + 1) \times (p + 1)$ (for non-zero bias). As
the reciprocal of this matrix is also a diagonal matrix, the estimates of all the
unknowns are mutually orthogonal. The efficiency of this design is

(17)   $\frac{(N - 1)p}{Np - 1}$   for zero bias,

or

(18)   $\frac{N - 1}{N}$   for non-zero bias.

By comparing the efficiency of the first design given in (13) and (14) with that
of the second design in (17) and (18) respectively, it would appear that the
efficiency of the first design is always higher than that of the second design for
non-zero bias, and is also higher in the case of zero bias for p > 1, but equal for
p = 1.
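The comparison can be confirmed by evaluating the four efficiency formulas in exact arithmetic. The sketch below is an illustration with hypothetical function names; it takes the formulas to be $1 - (p-1)/[N(N+p-2)]$ and $1 - p/[N(N+p-1)]$ for the first design and $(N-1)p/(Np-1)$ and $(N-1)/N$ for the second.

```python
from fractions import Fraction


def first_design_eff(n: int, p: int, bias: bool) -> Fraction:
    """Efficiency of the design with an added row 1, 1, ..., 1."""
    if bias:
        return 1 - Fraction(p, n * (n + p - 1))
    return 1 - Fraction(p - 1, n * (n + p - 2))


def second_design_eff(n: int, p: int, bias: bool) -> Fraction:
    """Efficiency of the design with an added row 1, 0, ..., 0."""
    if bias:
        return Fraction(n - 1, n)
    return Fraction((n - 1) * p, n * p - 1)


def first_is_better(m: int, p: int, bias: bool) -> bool:
    """Strict comparison of the two designs for N = 2^m + 1."""
    n = 2 ** m + 1
    return first_design_eff(n, p, bias) > second_design_eff(n, p, bias)
```

The difference in the zero-bias case works out to $(p-1)(N-1)^2 / [N(N+p-2)(Np-1)]$, which is positive for p > 1 and zero for p = 1.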


6. First design for $N = 2^m + r$, $p \leq 2^m$ (for zero bias) or $p \leq 2^m - 1$ (for
non-zero bias). For $N = 2^m + r$, $p \leq 2^m$ (for zero bias) or $p \leq 2^m - 1$ (for
non-zero bias), m being any positive integer and r any positive integer $< 2^m$,
a highly efficient design is represented by the matrix X obtained from the
corresponding matrix X for the general design in Section 3 above by adding r
rows 1, 1, ..., 1 to it. The matrix X'X for these designs then comes out as

(19)   $X'X = \begin{bmatrix} N & r & r & \cdots & r \\ r & N & r & \cdots & r \\ r & r & N & \cdots & r \\ \vdots & \vdots & \vdots & & \vdots \\ r & r & r & \cdots & N \end{bmatrix}$










7. Second design for $N = 2^m + r$, $p \leq 2^m$ (for zero bias) or $p \leq 2^m - 1$ (for
non-zero bias). Another design for these values of N and p is that represented
by the matrix X obtained from the corresponding matrix X for the general
design in Section 3 above by adding to it r rows 1, 0, 0, ..., 0. The matrix X'X
for this design is then given by

(24)   $X'X = \begin{bmatrix} N & 0 & 0 & \cdots & 0 \\ 0 & N - r & 0 & \cdots & 0 \\ 0 & 0 & N - r & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & N - r \end{bmatrix}$,

which is of order $p \times p$ if there is zero bias, or of order $(p + 1) \times (p + 1)$ if
there is bias. Here also the estimates of all the unknowns are mutually
orthogonal. The efficiency of the design comes out to be

(25)   $\frac{(N - r)p}{Np - r}$   if there is zero bias,

or

(26)   $\frac{N - r}{N}$   if there is bias.






By comparing the efficiency of the first design of this type given in (22) and
(23) with that of the present design given in (25) and (26) respectively, it would
appear that in case of zero bias, the efficiency of the first design is higher than
that of the second design for p > 1, but equal for p = 1; and in case of
non-zero bias, the efficiency of the first design is always higher than that of the
second.
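The efficiency of the second design with r added rows can be recomputed by constructing the design and inverting the diagonal matrix X'X element by element. The sketch below is an illustration: the orthogonal part comes from Sylvester's doubling rule (an assumption of the sketch), the function names are hypothetical, and in the biassed case column 0 is taken to be the bias column.

```python
from fractions import Fraction
from typing import List


def sylvester(m: int) -> List[List[int]]:
    """2^m x 2^m matrix of +1/-1 entries with mutually orthogonal columns."""
    h = [[1]]
    for _ in range(m):
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def second_design(m: int, r: int, cols: int) -> List[List[int]]:
    """First `cols` columns of the 2^m design plus r extra rows 1, 0, ..., 0."""
    base = [row[:cols] for row in sylvester(m)]
    extra = [[1] + [0] * (cols - 1) for _ in range(r)]
    return base + extra


def efficiency(m: int, r: int, p: int, bias: bool) -> Fraction:
    """p / (N * sum c_ii) over the p unknown weights, N = 2^m + r."""
    cols = p + 1 if bias else p
    x = second_design(m, r, cols)
    n = len(x)
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(cols)]
           for i in range(cols)]
    # X'X must be diagonal for the reciprocal to be read off term by term.
    assert all(xtx[i][j] == 0 for i in range(cols) for j in range(cols) if i != j)
    start = 1 if bias else 0
    total = sum(Fraction(1, xtx[i][i]) for i in range(start, cols))
    return Fraction(p) / (n * total)
```

For m = 3, r = 2, p = 4 (N = 10) this gives 16/19 with zero bias and 4/5 with bias, in agreement with $(N-r)p/(Np-r)$ and $(N-r)/N$.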

8. Comprehensive general design when N is at our choice. When N is at
our choice, we can always obtain a completely orthogonalized design by taking
N equal to a sufficiently large power of 2. For $p = 2^m$, m being any positive
integer, a completely orthogonalized design for $N = 2^m$, when there is zero
bias, has been given in Section 3 above. If, however, there is a bias, a
completely orthogonalized design can be constructed for $N = 2^{m+1}$. When
$p = 2^m + u$, where u is a positive integer $< 2^m$, a completely orthogonalized design
is available for $N = 2^{m+1}$ whether the bias is zero or not.

For $N = 2^{m+1}$ this is the most efficient design, with 100 per cent efficiency,
but as N is given higher powers of 2 than $2^{m+1}$ the variance of the estimate of
each unknown decreases. When $N = 2^t$, where $t > m + 1$, the variance of
each unknown is $1/2^{t-m-1}$ of that for $N = 2^{m+1}$.
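The rule for choosing N can be written out as a small helper. The sketch below is an illustration with hypothetical names: it returns the smallest power of 2 that supplies enough orthogonal columns, counting one extra column for the bias when there is one, and compares variances on the assumption that each unknown is estimated with variance $\sigma^2/N$ in a completely orthogonalized design.

```python
from fractions import Fraction


def smallest_orthogonal_n(p: int, bias: bool) -> int:
    """Smallest N = 2^m (m >= 1) with at least p columns, or p + 1 if biassed."""
    needed = p + 1 if bias else p
    n = 2
    while n < needed:
        n *= 2
    return n


def variance_ratio(n_small: int, n_large: int) -> Fraction:
    """Variance with N = n_large relative to N = n_small, each being sigma^2 / N."""
    return Fraction(n_small, n_large)
```

For p = 16 this gives N = 16 without bias and N = 32 with bias; for p = 20 it gives N = 32 in both cases. Going from N = 32 to N = 128 divides the variance of each estimate by 4.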

REFERENCES

[1] Harold Hotelling, "Some improvements in weighing and other experimental
techniques," Annals of Math. Stat., Vol. 15 (1944), pp. 297-306.
[2] R. C. Bose, "The fundamental theorem of linear estimation," Proceedings of the
Thirty-first Indian Science Congress, 1944, Part III.
[3] C. Radhakrishna Rao, "On linear estimation and testing of hypotheses," Current
Science, Vol. 13 (1944), pp. 164-165.
[4] K. Kishen, "On Latin and hyper-Graeco-Latin cubes and hyper-cubes," Current
Science, Vol. 11 (1942), pp. 98-99.



NOTES 

This section is devoted to brief research and expository articles, notes on methodology
and other short items.


NOTE ON THE LAW OF LARGE NUMBERS AND “FAIR” GAMES 

By W. Feller
Cornell University

1. "Fair" games. Let $\{X_k\}$ be a sequence of independent random variables
with the same cumulative distribution function V(x). Suppose that the
expectation

(1)   $E(X_k) = \int_{-\infty}^{\infty} x\, dV(x) = M$

exists, and put

(2)   $S_n = X_1 + \cdots + X_n$.

The weak law of large numbers states¹ that for every $\epsilon > 0$ and $n \to \infty$

(3)   $\Pr\{|S_n - nM| < \epsilon n\} \to 1$.

In the picturesque language of the theory of games this means that, after a
large number of trials, the accumulated gain $S_n$ will, with great probability, be
of the order of magnitude of nM. This led to the definition that a game is
"fair" if the entrance fee for each trial is M. Unfortunately this definition
creates the erroneous notion that a "fair" game is necessarily fair. To disprove
it we shall (section 3) exhibit an example which will show:

(I) A game can be "fair" and nevertheless such that the probability tends to one
that, after n trials, the player will have sustained a loss $L_n = nM - S_n$ of the order
of magnitude $n(\log n)^{-\eta}$, where $\eta > 0$ is arbitrarily small. In other words, in our
example

(4)   $\Pr\{nM - S_n > (1 - \epsilon) n (\log n)^{-\eta}\} \to 1$.

Of course, $L_n$ is necessarily of smaller order of magnitude than n; however, our
example can be modified in such a way that the ratio of the loss $L_n$ to the
accumulated entrance fees nM decreases as slowly as one pleases.

This shows that a "fair" game can be exceedingly disadvantageous.
Conversely, an "unfair" game can very well be advantageous. If a careful driver
insures his car, the game is clearly "unfair" according to definition, and yet some
states impose such games on drivers. Now in this and many other practical
cases the game is of such a nature that there is a very small probability p of
winning a comparatively great amount A; the "fair" price would be pA. In
such cases the law of large numbers would be significant only if n is large
compared to 1/p, whereas actually the maximum number of games to be played is
comparatively small. Clearly any theory meets practical requirements only
if it makes allowance for the number of trials and makes the "fair" price depend
on the number of trials.

¹ Usually (3) is proved only under more restrictive hypotheses. Actually the finiteness
of M implies even the strong law of large numbers; cf. Kolmogoroff, Grundbegriffe der
Wahrscheinlichkeitsrechnung (Berlin 1933), p. 69.

2. The Petersburg "paradox." For obvious reasons the classical theory of
probability was unable to provide a precise formulation of the law of large
numbers and to establish the actual conditions of its validity. Often it has
been looked upon as a direct consequence of the definition of probability, and
this led to the so-called Petersburg paradox which presents no difficulties to the
modern theory. It refers to the case where the expectation (1) is infinite. The
usual example exhibits a game in which the possible gains in each trial are
distributed according to

(5)   $\Pr\{X = 2^k\} = 2^{-k}$.

Here $M = \infty$. Now the law of large numbers (3) used to be proved (if at all)
only assuming the existence of moments of higher order. Nevertheless, the
classical theory postulated the validity of (3) even for $M = \infty$, and treating $\infty$
as a number (with $\infty - \infty = 0$) it argued that $\infty$ is a "fair" price for the game
as defined in (5). Great ingenuity was exercised in order to reconcile this
result with common sense.² Actually one can pass from (3) to the limit $M \to \infty$,
but the only result to be arrived at is trivial and could be anticipated without
theory: If the player pays for each trial a fixed amount A, he is likely to have a
positive gain provided he plays sufficiently long, i.e., provided $n > N(A)$,
where $N(A)$ itself increases with A.

Instead of a paradox we reach the conclusion that the price should depend on
n, that is to say vary as the number of trials increases. For best results this
should be the case even if M is finite. It should be noticed that in the Petersburg
case (5) a variable price can be determined so that a law of large numbers will
hold which is in every respect analogous to (3). In this formula nM is simply
the accumulated amount of entrance fees; denoting it by $P_n$, formula (3) takes
on the equivalent form

(6)   $\Pr\{|S_n - P_n| < \epsilon P_n\} \to 1$.

It is this interpretation of (3) that leads to the notion of "fair" games. Now
the Petersburg game can also be played in a "fair" way:

(II) Let the player in the Petersburg game (5) at the k-th trial pay the amount³
$\log_2 k$. The accumulated entrance fees up to the n-th trial are $\sim n \log_2 n$,
and the game is "fair" in the sense that the law of large numbers (6) holds. This
requirement determines the entrance fees essentially uniquely (that is to say up to
terms of smaller order of magnitude which, by definition, remain undetermined).

² Among the latest textbooks, von Mises (Wahrscheinlichkeitsrechnung, Leipzig-Wien
1931, p. 108f.) avoids the difficulty by declaring that (5) can not represent a collective because
of its infinite tail. This viewpoint is legitimate, but makes the law of large numbers
inapplicable to practically all useful distributions. Fry (Probability and Its Engineering Uses,
New York, 1928, p. 197) says: "The true explanation of the paradox is . . . based upon the
fact that in our every-day experience we have to deal only with individuals who have finite
fortunes and who would therefore be incapable of paying back the sums which are required
. . ." The problem does not seem to be mentioned in Uspensky's book.


3. Proofs. Theorems (I) and (II) follow easily from the following

Lemma: Let $a_n \to \infty$ be a sequence of positive numbers; in order that there exist
a sequence $\{b_n\}$ such that

(7)   $\Pr\{|S_n - b_n| < \epsilon a_n\} \to 1$

it is necessary and sufficient that for every $\delta > 0$ simultaneously

(8)

in this case (7) will hold with

(9)   $b_n = \sum_{k=1}^{n} \int_{|x| < a_k} x\, dV(x)$

(and, of course, for any other sequence $\{b_n^*\}$ if and only if $|b_n^* - b_n| = o(a_n)$).
This lemma is a simple consequence of the necessary and sufficient conditions
for the generalized law of large numbers⁴.

To prove theorem (II) we have to determine a sequence $\{a_n\}$ such that (7)
will hold for the distribution function defined in (5) and with $b_n \sim a_n$. A simple
computation shows that (8) will hold for any sequence $\{a_n\}$ which increases
faster than n. Moreover, the sequence $\{b_n\}$ defined by (9) will be of the same
order of magnitude as $\{a_n\}$ if, and only if, $a_n \sim n \log_2 n$. This proves (II).
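The "simple computation" for the Petersburg game can be made concrete. For the distribution (5) each admissible gain $2^k$ contributes $2^k \cdot 2^{-k} = 1$ to a truncated expectation, so the integral of x dV(x) over |x| < a is just the number of gains below a, which is close to $\log_2 a$. The sketch below is an illustration under two assumptions that are not stated in the text: k runs over 1, 2, ..., and the accumulated fee is formed as the sum over k of truncated expectations with $a_k = k \log_2 k$. Function names are hypothetical.

```python
import math


def truncated_mean(a: float) -> int:
    """Integral of x dV(x) over |x| < a for Pr{X = 2^k} = 2^-k, k = 1, 2, ...

    Each k with 2^k < a contributes exactly 1, so the value is a count."""
    count = 0
    while 2 ** (count + 1) < a:
        count += 1
    return count


def accumulated_fee(n: int) -> int:
    """Sum over k <= n of the truncated means with a_k = k log2 k."""
    return sum(truncated_mean(k * math.log2(k)) for k in range(2, n + 1))


n = 2 ** 14
ratio = accumulated_fee(n) / (n * math.log2(n))
```

The ratio of the accumulated fee to $n \log_2 n$ stays close to 1 and approaches it only slowly, since the neglected terms are of relative order $\log_2 \log_2 n / \log_2 n$.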

Now let > 0 be arbitrary, and define the distribution function V{x) to have 
a density 


( 10 ) 


V'ix) 


_ 5 __ 

a;* log’^’ X 


for X > e; 


at a: = 0 the function V{x) shall have a jump of magnitude 


( 11 ) 


1 


/"" ri dx 

L x^ log’'’’” X 


< 1 , 


while y(a;) is constant in the intervals a: < 0 and 0 < a; < e. For this distribu¬ 
tion function we have obviously M = 1. 


® Logs stands for the logarithm to the basis 2 
* Cf. Fellbk, Acta Untu, Sseged, Vol. 8 (1937), pp. 191-201 



(JKKHAIID TINTN'KH • 


:«4 

Next, let for $n > e$

(12) $a_n = \dfrac{n}{\log^{\delta} n}$.

Then (8) holds and from (9) and (10) we obtain easily for large $n$

(13) $b_n = n(1 - \log^{-\delta} a_n) < n - (1 - \epsilon) a_n$.

Substituting into (7) one sees that, again for sufficiently large $n$,

(14) $\Pr\{S_n - n + (1 - \epsilon) a_n < \epsilon a_n\} \to 1$,

or, since $M = 1$,

(15) $\Pr\{S_n - nM < -(1 - 2\epsilon) a_n\} \to 1$.

This proves (I).
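The inequality used in the last step can be checked numerically. The sketch below rests on assumptions that are not spelled out in this form in the text: the density is taken to be $\delta / (x^2 \log^{1+\delta} x)$ for $x > e$, the sequence is $a_n = n / \log^{\delta} n$, and the centering constant is $n$ times the mean truncated at $a_n$. The function names are hypothetical.

```python
import math

def centering_constant(n, delta):
    """Return (b_n, a_n), with b_n = n * (1 - log(a_n)**(-delta)) and a_n = n / log(n)**delta."""
    a_n = n / math.log(n) ** delta
    return n * (1.0 - math.log(a_n) ** (-delta)), a_n

def truncated_mean_numeric(a, delta, steps=200_000):
    """Midpoint rule for the integral of x * V'(x) over (e, a), after substituting u = log x."""
    # With u = log x the integrand becomes delta * u**-(1 + delta) on [1, log a].
    lo, hi = 1.0, math.log(a)
    h = (hi - lo) / steps
    return sum(delta * (lo + (i + 0.5) * h) ** -(1.0 + delta) for i in range(steps)) * h

n, delta, eps = 10 ** 6, 0.5, 0.1
b_n, a_n = centering_constant(n, delta)
print(b_n < n - (1 - eps) * a_n)  # the bound on the centering constant
```

The numerical integral agrees with the closed form $1 - \log^{-\delta} a$, which is what makes the bound on $b_n$ a one-line computation.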


A NOTE ON RANK, MULTICOLLINEARITY AND MULTIPLE REGRESSION¹

By Gerhard Tintner
Iowa State College

Let $X_{it}$ $(i = 1, 2, \ldots, M)$ be a set of $M$ random variables, each being observed at
$t = 1, 2, \ldots, N$: $X_{it} = M_{it} + y_{it}$. (This is essentially the situation envisaged
by Frisch [1].) The systematic part of our variables is $M_{it} = E X_{it}$. The $y_{it}$ are
normally distributed with means zero. Their variances and covariances are
independent of $t$. The $M_{it}$ and $y_{it}$ are independent of each other. Define
$\bar{X}_i = \sum_t X_{it}/N$, the arithmetic mean of $X_{it}$, and $x_{it} = X_{it} - \bar{X}_i$, the deviation from
the mean. Then $a_{ij} = \sum_t x_{it} x_{jt}/(N - 1)$ gives the variances and covariances
of the observations. We want to determine the rank of the matrix of the
variances and covariances of $M_{it}$.

Now assume that $\|V_{ij}\|$ is an estimate of the variance-covariance matrix of the
error terms or "disturbances" $y_{it}$. The elements of this matrix are distributed
according to the Wishart distribution and are independent of the $M_{it}$. They
can be estimated as deviations from polynomial trends, as deviations from
Fourier series, by the Variate Difference Method, etc. The estimates could also
be based upon a priori knowledge if for instance the $y_{it}$ are interpreted as errors
of measurement. Assume that the estimate is based upon $N'$ observations.

¹ The author is much obliged to Professors W. G. Cochran (Iowa State College), H.
Hotelling (Columbia University), T. Koopmans (University of Chicago) and A. Wald
(Columbia University) for advice and criticism with this paper. He has also profited by
reading the unpublished paper: "On the Validity of an Estimate from a Multiple Regression
Equation" by F. V. Waugh and R. O. Been which deals in part with a problem related to
the one presented here.

Journal Paper No. J-1323 of the Iowa Agricultural Experiment Station, Ames, Iowa. Project No. 730.





Form the determinantal equation:

(1) $|a_{ij} - \lambda V_{ij}| = 0$.

Apart from sampling fluctuations there should be $r$ solutions $\lambda = 1$ of equation
(1) if there are $r$ independent linear relationships between the $M_{it}$. The rank
of the variance-covariance matrix of $M_{it}$ is then $M - r$. Following a suggestion
of P. L. Hsu [2] made on the basis of the earlier work of R. A. Fisher [3] we form
the test function

(2) $\Lambda_r = (N - 1)(\lambda_1 + \lambda_2 + \cdots + \lambda_r)$,

where $\lambda_1$ is the smallest root of (1), $\lambda_2$ the next smallest, etc. Hence (2) is
$(N - 1)$ times the sum of the $r$ smallest roots of equation (1). The hypothesis to be tested is that
there are exactly $r$ independent linear relationships between the systematic
parts of our variables in the population. This quantity (2) is distributed like
$\chi^2$ with $r(N - M - 1 + r)$ degrees of freedom for large samples, i.e. if $N'$ becomes
large. It can be used for forming an opinion about the number of independent
relationships existing among the systematic parts of our variables.
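The test function can be illustrated for the smallest non-trivial case. The sketch below assumes $M = 2$, so that the determinantal equation reduces to a quadratic in $\lambda$; it also assumes that the multiplier and the degrees of freedom both use the number of observations $N$. The function names and the numerical inputs are hypothetical and serve only as an illustration.

```python
import math

def roots_2x2(a, v):
    """Roots of det(a - lam * v) = 0 for symmetric 2x2 matrices a and v (v positive definite)."""
    # Expanding the determinant gives the quadratic c2*lam**2 - c1*lam + c0 = 0.
    c2 = v[0][0] * v[1][1] - v[0][1] ** 2
    c1 = a[0][0] * v[1][1] + a[1][1] * v[0][0] - 2.0 * a[0][1] * v[0][1]
    c0 = a[0][0] * a[1][1] - a[0][1] ** 2
    disc = math.sqrt(c1 * c1 - 4.0 * c2 * c0)
    return sorted(((c1 - disc) / (2.0 * c2), (c1 + disc) / (2.0 * c2)))

def rank_test_statistic(a, v, n_obs, r):
    """(N - 1) times the sum of the r smallest roots, with r*(N - M - 1 + r) degrees of freedom."""
    m = len(a)
    lam = roots_2x2(a, v)
    stat = (n_obs - 1) * sum(lam[:r])
    dof = r * (n_obs - m - 1 + r)
    return stat, dof

a = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical covariance matrix of the observations
v = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical error covariance matrix
print(roots_2x2(a, v))
print(rank_test_statistic(a, v, n_obs=21, r=1))
```

In this example the smaller root equals 1, which is the value expected, apart from sampling fluctuations, when one linear relationship holds between the systematic parts. The statistic would then be referred to a $\chi^2$ table with the returned degrees of freedom.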
The importance of the question of the rank lies in the following: Sometimes
we are not so much interested in making predictions as in estimating the "true"
relationships which exist in the population which corresponds to our sample
(Wald) [4]. Practically speaking, these relationships and their estimation are
of great importance in economic statistics, as Haavelmo has shown [5]. But a
knowledge of the rank, i.e. the number of independent relationships existing
between the systematic parts of the variables, may also be of some significance for
the problem of prediction. The inclusion of strongly correlated predictors
cuts down on the number of degrees of freedom without contributing significantly
to the reduction of the variance.

The remainder of this paper will be concerned with an attempt to estimate
the relationships which in the population exist between the systematic parts
of the variables. This is an extension of the work of T. Koopmans [6] and the
author [7] who dealt with the special case in which there is only one relationship
between the systematic parts.

Suppose that we decide that there are $R$ independent relationships among the
systematic parts of our variables

(3) $k_{v0} + \sum_j k_{vj} M_{jt} = f_{vt} = 0$; $v = 1, 2, \ldots, R$; $t = 1, 2, \ldots, N$.

We desire to obtain estimates of these relationships. Our purpose here is not
prediction but estimation of the structural coefficients $k_{vj}$.

The method of maximum likelihood leads to the method of least squares if we
treat the $V_{ij}$ as constants. This is again permissible if $N'$ is large and our
estimates of the $V_{ij}$ become reasonably accurate. We have to minimize the
following sum of squares

(4) $Q = \sum_t Q_t$





where

(5) $Q_t = \sum_i \sum_j V^{ij}(x_{it} - m_{it})(x_{jt} - m_{jt})$,

where $\|V^{ij}\| = \|V_{ij}\|^{-1}$ is the inverse of the variance-covariance matrix of the
errors. We also define $m_{it} = M_{it} - \bar{M}_i$ $(t = 1, 2, \ldots, N)$ where $\bar{M}_i$ is the
mean of $M_{it}$.

If there are $R$ relationships (3) they can be written by using only $R(M - R)$
coefficients $k_{vj}$ $(j = 1, 2, \ldots, M)$, if we disregard the constant terms $k_{v0}$, because
we are now dealing with deviations from means. We can for instance express
the first $(M - R)$ variables $m_{it}$ in terms of the last $R$ variables $m_{it}$. Hence,
we have to impose $R^2$ conditions upon the $MR$ coefficients $k_{vj}$ $(j = 1, 2, \ldots, M)$
appearing in (3).

We impose $R(R + 1)/2$ conditions as follows

(6) $\sum_i \sum_j V_{ij} k_{vi} k_{wj} = \delta_{vw}$,

where $\delta_{vw}$ is a Kronecker delta. These conditions orthogonalize and normalize
the coefficients $k_{vj}$. We have now to adjust the $Q_t$ as given in (5) under the
conditions (6) by determining appropriate $m_{it}$. This is a problem of restricted
minima.

We introduce a new function 

(7) F, ^Q, - EmU, 

V 

where the $\mu_{vt}$ are Lagrange multipliers. Differentiating with respect to $m_{it}$ and
setting equal to zero we get the solution:

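(8) $\sum_j V^{ij}(x_{jt} - m_{jt}) = \sum_v \mu_{vt} k_{vi}$; $(i = 1, 2, \ldots, M)$;

or, solving for $x_{it} - m_{it}$,

(9) $x_{it} - m_{it} = \sum_v \sum_j \mu_{vt} V_{ij} k_{vj}$; $i = 1, 2, \ldots, M$.

Multiplying (9) by $k_{wi}$ and summing we get

(10) $\mu_{wt} = \sum_i k_{wi} x_{it}$.

Hence we have

(11) $Q = \sum_v \sum_t \mu_{vt}^2 = \sum_v \sum_t \left(\sum_i k_{vi} x_{it}\right)^2$.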

Now we dispose of the remaining $R(R - 1)/2$ conditions

(12) $\sum_t \mu_{vt} \mu_{wt} = 0$, $v \neq w$.

We have to minimize $Q$ under the conditions (6) and (12). This is done
by finding the appropriate $k_{vi}$.

We form a new expression 





(^3) Cr Su(Xuuii/fluj 

where the $\alpha_{vw}$ and $\beta_{vw}$ $(v \neq w)$ are again Lagrange multipliers and $\beta_{vv} = 0$.
Because of considerations of symmetry we have: $\alpha_{vw} = \alpha_{wv}$ and $\beta_{vw} = \beta_{wv}$.
Differentiating with respect to $k_{vi}$ and setting equal to zero we get the condition

2Jnj 

(14) 

2^ 0:^,0 Sy Vt]kuij t a = 1( 2j ' • ■ , f?, f = 1, 2, • • ■, ilf. 

Multiplying by $k_{vi}$ and summing we get

(15) = avv 

Multiplying by $k_{zi}$ $(z \neq v)$ and summing we have

(16) ~ 0tv2 {v 7^ z). 

Both (15) and (16) follow from conditions (6) and (12).

Exchanging the role of $v$ and $z$ in (16) we have also

(17) ^vz'Stftl, = (v z)- 

Hence we have $\alpha_{vw} = \beta_{vw} = 0$, if $v \neq w$. Inserting these results in (14) we get a
system of linear and homogeneous equations in the unknown coefficients $k_{vi}$.
The determinant of the system must be equal to zero in order to yield non-trivial
solutions. Trivial solutions are not admitted because of (6). Hence the $\alpha_{vv}$
are simply the roots $\alpha$ of the equation $|\sum_t x_{it} x_{jt} - \alpha V_{ij}| = 0$.

Introducing

(18) $\lambda_v = \alpha_{vv}/(N - 1)$,

expression (14) becomes actually the determinantal equation (1). This expression
can be used to find the $R$ smallest latent roots $\lambda_v$ and the corresponding
characteristic vectors $k_{vi}$ by Hotelling's methods [8].

The constants of the equation (3) are finally determined by the condition
that the optimum solutions have to go through the means of the variables

(19) $k_{v0} + \sum_j k_{vj} \bar{X}_j = 0$.

The distribution of the variances and covariances of the observations has recently
been established by T. W. Anderson and M. A. Girshick for the cases $R = M - 1$
and $R = M - 2$ [9].


REFERENCES

[1] R. Frisch: Statistical Confluence Analysis by Means of Complete Regression Systems,
Oslo, 1934.

[2] P. L. Hsu: "On the problem of rank and the limiting distribution of Fisher's test
function," Annals of Eugenics, Vol. 11 (1941), pp. 39 ff.

[3] R. A. Fisher: "The statistical utilization of multiple measurements," Annals of
Eugenics, Vol. 8 (1938), pp. 376 ff.

[4] A. Wald: "The fitting of straight lines if both variables are subject to error," Annals
of Math. Stat., Vol. 11 (1940), pp. 284 ff.

[5] T. Haavelmo: "The probability approach in econometrics," Econometrica, Vol. 12
(1944), Supplement.

[6] T. Koopmans: Linear Regression Analysis in Economic Time Series, Haarlem, 1937.

[7] G. Tintner: "An application of the variate difference method to multiple regression,"
Econometrica, Vol. 12 (1944), pp. 97 ff.

[8] H. Hotelling: "Simplified calculation of principal components," Psychometrika, Vol. 1
(1936), pp. 27 ff.

[9] T. W. Anderson and M. A. Girshick: "Some extensions of the Wishart distribution,"
Annals of Math. Stat., Vol. 16 (1944), pp. 364 ff.


NOTE ON THE DISTRIBUTION OF THE SERIAL
CORRELATION COEFFICIENT¹

By William G. Madow
Bureau of the Census

The distribution of the serial correlation coefficient when $\rho = 0$ has been
previously obtained.² The purpose of this note is to derive the distribution of
the serial correlation coefficient, using the circular definition, when $\rho \neq 0$.

Let us assume that the random variables $x_1, \ldots, x_N$ have a joint normal
distribution³ $p(x_1, \ldots, x_N | A, B, \mu)$ where

$\log p(x_1, \ldots, x_N | A, B, \mu) = \log K_1 - \tfrac{1}{2}\left[A \sum_i (x_i - \mu)^2 + 2B \sum_i (x_i - \mu)(x_{i+L} - \mu)\right]$,

the term in the bracket is positive definite, $K_1$ is independent of the $x_i$ and if
$i + L > N$ then $x_{i+L} = x_{i+L-N}$. It is then clear that $\bar{x}$, $V_N$, and ${}_L C_N$, where
$\bar{x}$ is the arithmetic mean, $V_N = \sum_i (x_i - \bar{x})^2$ and

${}_L C_N = \sum_i (x_i - \bar{x})(x_{i+L} - \bar{x})$

are sufficient statistics with respect to the estimation of $\mu$, $A$, and $B$.

¹ Presented at a meeting of the Cowles Commission for Economic Research in Chicago,
January 31, 1945.

² See R. L. Anderson, "Distribution of the serial correlation coefficient", pp. 1-13 and T.
Koopmans, "Serial correlation and quadratic forms in normal variables", pp. 14-33, Annals
of Math. Stat., Vol. XIII, No. 1, March, 1942.

³ The expression $p(t_1, \ldots, t_m | \theta_1, \ldots, \theta_k)$ means the probability density or the
distribution of the random variables $t_1, \ldots, t_m$ for the given values of the parameters
$\theta_1, \ldots, \theta_k$. When used as an index of summation or multiplication, the letter $i$ will
assume all values from 1 through $N$.

Let $V_N \cdot {}_L R_N = {}_L C_N$ define ${}_L R_N$, the serial correlation coefficient. Then if
$A = 1$, $B = 0$ Anderson has shown⁴ that, if $N$ is odd, the joint distribution of
${}_1 R_N$ and $V_N$ is given by
iRif and Fjv is given by 

(1) D{R„, 7:.) = £ (X. - for K+i < 12;. < Xm 

i-i 

where 

12;. = iRn , x* = cos a< = n (X< — Xy),' for all j i 

A y-i 

and TjKA' — 3)]; while if JV is even, the same formula holds except 

that 

l(V-S) 

a. = n (x< - Xy) V(X. + 1), for all j i. 

j-i 


We now extend Anderson's distributions to the case where it is not assumed
that $A = 1$ and $B = 0$.

As a means of extending⁵ Anderson's distribution let us recall that if $x_1, \ldots, x_N$
have a distribution $p(x_1, \ldots, x_N | \theta_1, \ldots, \theta_g)$ depending on several
parameters $\theta_1, \ldots, \theta_g$, and if $z_1, \ldots, z_k$ are a sufficient set of statistics with respect
to $\theta_1, \ldots, \theta_g$, i.e.

$p(x_1, \ldots, x_N | \theta_1, \ldots, \theta_g) = h(z_1, \ldots, z_k | \theta_1, \ldots, \theta_g)\, m(x_1, \ldots, x_N)$

where $m(x_1, \ldots, x_N)$ is independent of $\theta_1, \ldots, \theta_g$, then if the distribution of
$z_1, \ldots, z_k$ is found, assuming $\theta_1, \ldots, \theta_g$ have specific values $\theta_1^0, \ldots, \theta_g^0$,
then it follows that

$p(z_1, \ldots, z_k | \theta_1, \ldots, \theta_g) = p(z_1, \ldots, z_k | \theta_1^0, \ldots, \theta_g^0)\, \dfrac{h(z_1, \ldots, z_k | \theta_1, \ldots, \theta_g)}{h(z_1, \ldots, z_k | \theta_1^0, \ldots, \theta_g^0)}$.


We may call Anderson's distribution given in (1), $p(R_N, V_N | 1, 0)$, i.e.

$p(R_N, V_N | 1, 0) = D(R_N, V_N)$.

Furthermore, $\bar{x}$ is distributed independently of $R_N$ and $V_N$ for all values of $A$
and $B$ and hence by a simple transformation,⁶ we can apply the above theorem.


⁴ Anderson, loc. cit., p. 3 and p. 5. Although the remainder of the note deals only with
the case where $L = 1$ the procedure is general and may be easily carried through for other
lags.

⁵ See W. G. Madow, "Contributions to the theory of multivariate statistical analysis",
Trans. of the Amer. Math. Soc., Vol. 44, No. 3, November 1938, p. 461.

⁶ For a proof that an orthogonal transformation of the variable $x_i - \mu$ exists such that
$V_N$ and ${}_1 C_N$ are simultaneously reduced to canonical forms involving the same $N - 1$ of
the variables of the transformation, and $\sqrt{N}(\bar{x} - \mu)$ is the $N$th variable of the
transformation, see J. von Neumann, "Distribution of the ratio of the mean square successive
difference to the variance," Annals of Math. Stat., Vol. XII, No. 4, December 1941, pp. 368, 369.
The proof there is given for $V_N$ and $\sum_i (x_i - x_{i+1})^2$ but is easily extended to this case.
Then it is easy to show that $\sqrt{N}(\bar{x} - \mu)$ is independently distributed of $V_N$ and ${}_1 C_N$ and
has distribution $\log p[\sqrt{N}(\bar{x} - \mu) | A, B] = \log K_2 - \tfrac{1}{2}[A + 2B] N (\bar{x} - \mu)^2$ where
$K_2 = (2\pi)^{-1/2}(A + 2B)^{1/2}$ and $K_1' K_2 = K_1$.





Then

$p(R_N, V_N | A, B) = p(R_N, V_N | 1, 0)\, \Omega$

where


Hence it follows that, 








p{R >,, 7^ I A, B) = E (X. - , 


1-1 


for $\lambda_{m+1} < R_N < \lambda_m$, where the $\alpha_i$ have different values according to whether $N$
is odd or even. In order to evaluate $p(R_N | A, B)$ we then need only integrate
out $V_N$. Now


Jo 

Hence 

p(R^ 1 A, B) = mN - l)](A/2 + E (X, - 

i'mi 

The parameters $K_1$, $A$ and $B$ depend on the different types of assumptions that
may be made. In general

where $\mathbf{A}$ is a circulant $(a_1, \ldots, a_N)$ such that

$a_1 = A$, $a_{1+L} = B$, $a_{1+(N-L)} = B$, $a_i = 0$ otherwise,
and hence

•A = n (^+5 cos = n (^ + J3xd. 

Then, one assumption is

$A = 1/\sigma^2$, $B = -\rho/\sigma^2$

where $\rho$ is the "true" serial correlation coefficient. Other assumptions are
possible.⁷ However, these vary with the problem under consideration and may
be left for further examination.
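The determinant of such a circulant can be checked numerically. The sketch below assumes the symmetric circulant has $A$ on the diagonal and $B$ at lags $+L$ and $-L$ (taken modulo $N$), in which case its eigenvalues are $A + 2B\cos(2\pi j L/N)$, $j = 1, \ldots, N$, by the standard result for circulant matrices. The function names and the numerical values are hypothetical.

```python
import math

def circulant(n, a, b, lag):
    """Symmetric circulant with a on the diagonal and b at lags +lag and -lag (mod n)."""
    rows = [[0.0] * n for _ in range(n)]
    for i in range(n):
        rows[i][i] = a
        rows[i][(i + lag) % n] += b
        rows[i][(i - lag) % n] += b
    return rows

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def eigenvalue_product(n, a, b, lag):
    """Product over j of a + 2*b*cos(2*pi*j*lag/n), the eigenvalues of the circulant."""
    return math.prod(a + 2.0 * b * math.cos(2.0 * math.pi * j * lag / n) for j in range(1, n + 1))

# Hypothetical values: sigma**2 = 1 and rho = 0.3, so a = 1 and b = -0.3.
print(determinant(circulant(7, 1.0, -0.3, 1)), eigenvalue_product(7, 1.0, -0.3, 1))
```

The two printed numbers agree up to rounding, which is the product form of the determinant under the stated assumption about the matrix.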


⁷ One possible alternative definition is given by W. J. Dixon, "Further contributions to
the problem of serial correlation", Annals of Math. Stat., Vol. XV, No. 2, June 1944, p. 120,
equation (2.1).





NOTE ON A PAPER BY C. W. COTTERMAN AND L. H. SNYDER

By H. B. Mann¹

Ohio State University

C. W. Cotterman and L. H. Snyder [1] gave a method to test simple Mendelian
inheritance in randomly collected data. From a population assumed to
be at equilibrium a sample is taken. The number of homozygous recessives in
the sample is known. We wish to estimate the number of heterozygous individuals
in the sample.

Let $\alpha$ be the proportion of recessive genes among all genes in the population;
$\pi$, $\rho$, $\tau$ the proportion in the population of homozygous recessives, heterozygous
and homozygous dominant individuals respectively and $p$, $r$, $t$ the sampling
values of $\pi$, $\rho$, $\tau$. Then

(1) $\pi = \alpha^2$, $\rho = 2\alpha(1 - \alpha)$, $\tau = (1 - \alpha)^2$, $p + r + t = 1$.

Cotterman and Snyder use as an estimate of $r$ the quantity $2\sqrt{p}(1 - \sqrt{p})$.
It is the purpose of this note to show that this estimate is for all practical purposes
equivalent to the maximum likelihood estimate of $r$.

The joint distribution of $p$, $r$ and $t$ in samples of $n$ is given by

(2) $P(p, r, t) = \dfrac{n!\, \pi^{np} \rho^{nr} \tau^{nt}}{(np)!\,(nr)!\,(nt)!} = \dfrac{n!\, \alpha^{2np} [2\alpha(1 - \alpha)]^{nr} (1 - \alpha)^{2nt}}{(np)!\,(nr)!\,(nt)!}$,

where $P(p, r, t)$ is the probability of obtaining the values $p$, $r$, $t$ in samples of $n$.
We wish to maximize $P(p, r, t)$ for fixed values of $p$ with respect to $\alpha$ and $r$.
Maximizing first with respect to $\alpha$ one easily obtains

(3) $2\alpha = 2p + r$.

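We can regard $\alpha$ as a continuous parameter and hence (3) must hold at any
maximum of $P(p, r, t)$. For any maximum of $P(p, r, t)$ we must further have

$\dfrac{n!\, \pi^{np} \rho^{nr} \tau^{nt}}{(np)!\,(nr)!\,(nt)!} \geq \dfrac{n!\, \pi^{np} \rho^{nr+1} \tau^{nt-1}}{(np)!\,(nr + 1)!\,(nt - 1)!}$

and

$\dfrac{n!\, \pi^{np} \rho^{nr} \tau^{nt}}{(np)!\,(nr)!\,(nt)!} \geq \dfrac{n!\, \pi^{np} \rho^{nr-1} \tau^{nt+1}}{(np)!\,(nr - 1)!\,(nt + 1)!}$.

This leads to the inequalities

(4) $\dfrac{\tau}{nt} \geq \dfrac{\rho}{nr + 1}$, $\dfrac{\rho}{nr} \geq \dfrac{\tau}{nt + 1}$.

Substituting $t = 1 - p - r$, $\tau = 1 - \pi - \rho$ one easily obtains from (4)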


¹ Research under a grant of the Research Foundation of Ohio State University.





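(5) $\dfrac{\rho n - \rho n p + \rho}{n(1 - \pi)} \geq r \geq \dfrac{\rho n - \rho n p - \tau}{n(1 - \pi)}$.

The difference of the two bounds is $1/n$. Hence $r$ must satisfy an equation

$r = \dfrac{\rho n - \rho n p + \rho}{n(1 - \pi)} - \dfrac{\theta}{n}$, $0 \leq \theta \leq 1$.

Substituting the values for $\rho$, $\tau$ and $\pi$ from (1) and (3) we obtain
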

471 471“ 71 ■ 

Since $0 \leq \theta \leq 1$ we obtain from (3)

From (6) we see that for all practical purposes we may use the estimate

$r = 2\sqrt{p}(1 - \sqrt{p})$.
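The agreement between the two estimates can be illustrated by brute force. The sketch below is a hedged illustration rather than the argument of the note: for a fixed count of homozygous recessives it maximizes the multinomial likelihood over the integer number of heterozygotes, with $\alpha$ set to its maximizing value from (3) for each candidate, and compares the result with $2\sqrt{p}(1 - \sqrt{p})$. The function names and the sample values are hypothetical.

```python
import math

def log_likelihood(n, n_p, n_r):
    """Log of P(p, r, t) with alpha set to its maximizing value (2p + r)/2 from equation (3)."""
    n_t = n - n_p - n_r
    alpha = (2 * n_p + n_r) / (2 * n)
    # Multinomial coefficient via log-gamma; genotype probabilities as in equation (1).
    log_coef = (math.lgamma(n + 1) - math.lgamma(n_p + 1)
                - math.lgamma(n_r + 1) - math.lgamma(n_t + 1))
    log_pi = 2 * math.log(alpha)
    log_rho = math.log(2 * alpha * (1 - alpha))
    log_tau = 2 * math.log(1 - alpha)
    return log_coef + n_p * log_pi + n_r * log_rho + n_t * log_tau

def ml_heterozygote_count(n, n_p):
    """Integer n_r in 0..n-n_p that maximizes the likelihood (requires 0 < n_p < n)."""
    return max(range(0, n - n_p + 1), key=lambda n_r: log_likelihood(n, n_p, n_r))

def cotterman_snyder(p):
    """The estimate 2*sqrt(p)*(1 - sqrt(p)) of the heterozygote proportion."""
    return 2 * math.sqrt(p) * (1 - math.sqrt(p))

n, n_p = 100, 25   # hypothetical sample: 25 homozygous recessives among 100 individuals
print(ml_heterozygote_count(n, n_p) / n, cotterman_snyder(n_p / n))
```

For this sample both estimates equal 0.5. For other sample values the two need not coincide exactly, but the bounds derived above limit how far apart they can be.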


REFERENCE

[1] C. W. Cotterman and L. H. Snyder, "Tests of simple Mendelian inheritance in
randomly collected data of one and two generations," Jour. Am. Stat. Assn.,
Vol. 34 (1939), pp. 511-523.



NEWS AND NOTICES 

Readers are invited to submit to the Secretary of the Institute news items of interest.

Personal Items 

Dr. R. G. D. Allen, who has been associated with the Combined Production
and Resources Board in Washington, has returned to the London School of Economics.

Dr. Kenneth J. Arnold, who has been doing war research work with the 
Columbia University Statistical Research Group has returned to his position 
at the University of Wisconsin. 

Dr. Lee A. Aroian, on leave from Hunter College is serving as a research 
associate in the Applied Mathematics Panel Project at Berkeley, California 
under the direction of Professor Neyman. 

Dr. Ernest E. Blanche has been appointed to the teaching staff of the Army
University organized by the War Department for American veterans at Florence,
Italy.

Assistant Professor Z. W. Birnbaum of the University of Washington has 
been promoted to an associate professorship. 

Dr. Alva E. Brandt has returned from the Operational Research Section of 
the Ninth Air Force in Europe. 

Associate Professor R. S. Burington of the Case School of Applied Science 
has received the Meritorious Civilian Award from the United States Navy. 

Dr. Irving W. Burr has been promoted to an associate professorship at Purdue
University.

Miss Frances Campbell, after receiving her doctorate at Michigan in June,
has returned to her position at George Pepperdine College, Los Angeles.

Professor Harry C. Carver, after a year of service with the Army Air Forces, 
has returned to the University of Michigan. 

Professor W. G. Cochran has returned to Iowa State College from a special 
mission to Germany. 

Professor Churchill Eisenhart, who has been doing war research work with 
the Columbia University Statistical Research Group, has returned to the 
University of Wisconsin. 

Miss Mary Elveback has been appointed to an assistant professorship at
Rockford College.

Assistant Professor C. H. Fischer of the University of Michigan has been 
promoted to an associate professorship. 

Mr. Elvin A. Hoy, who has spent three years with the War Production Board, 
is now Chief of the Statistics Section of the Bureau of Research and Statistics 
of the Social Security Board. 

Professor P. L. Hsu of Kunming, China, has been appointed to a visiting
professorship of statistics at Columbia University, beginning January 1946.

Dr. Doncaster G. Humm has received an honorary Doctor of Science degree
at Bucknell University.




Mr. Joseph M. Juran, who has served during the war with the Foreign Economic
Administration, is now Chairman of the Department of Administrative
Engineering at New York University.

Dr. Eugene Lukacs has been appointed Professor and Head of the Mathematics
Department at Our Lady of Cincinnati College.

Dr. R. v. Mises of Harvard University has been appointed to a professorship
of aerodynamics and applied mathematics.

Professor A. M. Mood has returned from Princeton University to his position
at Iowa State College.

Assistant Professor Henry Scheffé of Syracuse University has been granted
leave of absence to serve as senior mathematician with Princeton University
Station of Division 2 of NDRC.

Symposium at the University of California 

A Symposium on Mathematical Statistics and Probability was held at the
University of California at Berkeley on August 13-18, 1945. Those participating
in the symposium as speakers or chairmen were:

Dean G. P. Adams, Prof. E. B. Babcock, Prof. E. M. Beesley, Prof. B. A. Bernstein, Prof.
Egon Brunswik, Prof. A. H. Copeland, Prof. P. H. Daus, Lt. Comm. F. W. Dresch, Prof.
G. C. Evans, Miss Evelyn Fix, Prof. Harold Hotelling, Prof. Victor F. Lenzen, Prof. Jay L.
Lush, Prof. J. H. McDonald, Prof. George F. McEwen, Prof. J. Neyman, Prof. G. Polya,
Prof. Hans Reichenbach, Prof. A. C. Schaeffer, Prof. Morgan Ward, and Dr. Jacob
Wolfowitz.


New Members 

The following persons have been elected to membership in the Institute:

Abbey, Helen, M.A. (Michigan) Stat, Bur. of Records & Stat. Mich. Dept of Health, 916 
N Chestnut, Lansing, Michigan. 

Acton, Forman, Ch. E. (Princeton) T/4 Army of the U S , SED Barracks Area, Oak Ridge, 
Tenn. 

Aitchison, Beatrice, Ph.D. (Johns Hopkins) Econ. & Stat. Analy., I.C.C., 1929 S St.,
N.W., Wash. 9, D. C.

Aimer, George, A B (Western Reserve) Stat. Ohio High Plan Sar , B76 So 18th St. §199 
Arlington, Va 

Bartlett, Maurice, D Sc, (London) Univ Lecturer, Cambridge, 1S7 Chesterton Road, Cam¬ 
bridge, Eng. 

Berwick, Leo, A.B. (New York Umv.) Capt, A C Asst to Surgeon Stat, Unit of Psych. 
Sect. Hq. AFTRC, T & P Bldg , Fortworth 9, Texas 

Blackwell, Asst. Prof. David, Ph.D. (Illinois) Math. Dept., Howard Univ., Wash., D. C.

Borland, James, M.A. (Indiana) Capt, Ex. Officer, Inspect. Office, Pine Bluff Arsenal, 
Ark 

Brown, Prof. Theo., Ph.D (Yale) Bus, Stat. Harvard Bus. School, Soldier's Field, Boston 
63, Mass, 

Bunke, Alfred, M A (Columbia) Sen. Stat. N Y. State Dept, of Labor, 37 Parkwood St. 
Albany 3, N. Y. 

Burington, Assoc. Prof. Richard, Ph.D. (Ohio) On leave from Case School of Applied
Science, Cleveland, Ohio; at present, Head Math., Bu. Ord., USN, 6900 N. Carlin Spring
Rd., Arlington, Va.





Campbell, James Ph D (Edinburgh) Univ Math Lecturer, Victoria XJniv Coll. Welt, 
W.J New Zeal 

Churchill, Edmund, A M (Columbia) 1686 Union Port Road, New York Z, N. Y. 

Cornfield, Jerome, B S (New York Univ.) Siat. Dept of Labor, RF D %Z Herndon, 
Va 

Ctuden, Dorothy, A B. (California) Stat in Sampling Sect Spec.Sur Div Bur of Census 
% Pop. Dtv. Wash , D. C. 

Daniel, Cuthbert, M S (Mass Inst Tech ) Stat Eng , Carbide and Carbon Chem. Corp , 
460 East Drive, Oak Ridge, Tenn 

David, Florence, Ph D (London) Univ Sect Stat Dept Untv Coll, London, W.C. 1, 
England 

De Garis, Prof. Charles, Ph.D. (Johns Hopkins) Univ. of Okla. School of Med., Okla. City
4, Okla.

Echegaray, Miguel, C E Ag. Attache to the Spanish Embassy, Z700 16th St N.W. Wash., 
D. C 

Ede, Richard, B.S (Wisconsin) Chemistry Devel. Metallurgist, Gary Works, Car. Steel 
Ill. 647 Ftllmore St, Gary, Indiana. 

Ewart, Robert, AB. (New York Univ.) Research Physicist, Ballistics Dept. Dea Moines 
Ord. Plant 683'46th St. Des Moines 12, Iowa. 

Federer, Walter, M S (Kansas State) Research-Ag Stat Siat Lab , Iowa State Coll. 
Ames, Iowa 

Freeman, Richard, B Sc. (McMaster) Research Chemist. 1 Maple Ave., Hamilton, Ontario, 
Canada 

Goldrosen, David, B S. (Worcester Poly Inst) Lt USNR Quality Control Officer, Insp. 
of Naval MatT 204 Ward St Newton Centre, Mass. 

Goodman, Albert, Supervisor Stat. Control, Quality Control, Weatinghouse Elec. Corp., 
Easinglon, Pa. 

Grant, Asst. Prof. David, Ph D. (Stanford) Dept, of Psych , Univ of Wis., Madison 6, 
Wisconsin 

Greenhouse, Samuel, B S. (City Coll N. Y ) TI4 U.S. Army, 6815-lSth St N.W. Wash., 
11, D C. 

Gretton, Owen, A.B (Brown) Acting Chief, Ind Div. Sen. Econ , 10167 Old Bladensburg 
Road, Silver Spring, Maryland. 

Hayden, Byron, A.B. (Geo. Wash. Univ ) Econ. Stat A. A. F. Wash D. C. 1301 S. Cleve¬ 
land St , Arlington, Va. 

Hecht, Bernard, B E.C. (City Coll, of N. Y ) T/agl, 616 Corp , Army-Navy Electronics 
Stand Agency 42 Washington Village, Aabury Park, N J. 

Haufek, Lyman, MBA. (Northwestern) U. 8. Army Hq. ASF, Chief Supply Stat. Unit, 
1121 New Hampshire Ave., N W , Wash. 7, D. C. 

Kampschaefer, Margaret, A B. (Indiana) Stat. Bur. of Labor Stat. 1087 E. Blackford 
Ave., Evarisville, 13, Indiana. 

Kozakiewicz, Waclaw, Ph.D. (Warsaw) Inst. in Math., Univ. of Saskatoon, Saskatoon,
Canada.

Laguardia, Prof. Rafael, Director of Math. & Stat. (Univ. of Uruguay) Fine Hall,
Princeton Univ., Princeton, N. J.

Leighton, Walter, Ph D (Harvard) On leave at Northwestern as Director, Applied Math. 
Group (NORC) Lecturer in Math. The Rice Inst. 1704 Judson Ave , Evanston, Illinois. 

Lieblein, Julius, M.A. (Brooklyn Coll.) Econ. Anal., Room 4013, U. S. Treas. Dept., 15th
& Penna. N.W., Wash. 26, D. C.

Lien, Roy, M S. (Oregon State) Rate Stat., Northwestern Elect. Co., Portland, Oregon, 
3121 S E. Division St, Portland 2, Oregon. 

Lonseth, Asst. Prof. Arvid, Ph.D. (California) Math Dept. Northwestern University, 
Evan, III 





Michalup, Eric, Ph.D. (Univ. of Vienna) Math. & Astronomy, Actuary, Apartado 848,
Caracas, Venezuela.

Monro, Sutton B.S. (Mass. Inst Tech.) Head of Str. Staff Unit. Amin, Div. Naval Ord. 

Lab, Lt. USNIl SPS Martha CusUa Dr. Alexandria, Va. 

Nllson, Hugo, Ph.D (Minnesota) Chemist in Charge, Fishery Tech Lab. U. S, Fish A 
Wildlife Sere. College Park, Maryland 

Nichols, Russell, B.A. (DePauu') (Sergeant, V. .S'. Army Co. A. BBO A /, Kn APO BBS 
NYC(3S-74B-m). 

O’Nell, Frank (I/iwell Textilo Inst.) .Senior Textile Teehnieinii, W'nrs/ed Division, Pacific 
Mills', Laurence, Afasa. 

Rappaport, Gladys, B.A. (Hunter) Jr Slat Slat. lleHoareli (Irmip, CS.hinibin, Umv , 
BJBO Tiehout .dee., Bronx 61, Meiu York. 

Rice, Assoc. Prof. Nelson, Ph J) (C tf. of A ) 3SBB ISlhSl A! K., Wash., il, 1) C. 

Schell, Emil, M.A (Western Reserve) Slat. Employment Stal Div. S 440 A'. IB ltd 
Arlington, Va 

Schneberger, Richard, (Cert, to tench in Tech High School Training for Industry State 
Programs) %Bdiaon Gen. Elec, Appl. Co , BSOO IT. Taylor El., Chgo,, III. 

Simon, Geo,, Ed. M, (Harvard) Capt., A C Avia. Psych. Psych. Section, Surgeon, Ilq, 
AFTRC, FL Worth B, Texas. 

Spaulding, Asa, M.A, (Michigan) Actuary & Asst .Sec, No. Carolina Mul Life Ins. Co. 
Durham, Norlh Carolina. 

Spoerl, Charles, B.A. (Harvard) Asst, Treas. %.lffnrt Life Ins. Co. Hartford, Conn 
Springer, Wm,, O.E. (Columbia) Asst Vice Pres in cliarge of Research, Bristol-Myers Co, 
Hillside 6, New Jersey. 

Stock, J. Stevens, M.A. (American) Lt. USNR, lid. .Slat. Sect. Div. of Shore Est. & 
Civilian Per. iVat'v Dept., HBOS Garfield ,91, Bclhesda, Maryland. 

Stott, Alex, A.B. (Harvard) Lt. Comdr. USNR, 8800 Devonshire PI., jV.M’, Wash 8, D. C 
Taylor, Thomas, Ph.D. (Yale) Reacnrch Engineer, U. S. Testing Co. 4B Grover Lane, Cald- 
wll, N. J. 

Treanor, Glen, B.A. (Minnesota) Principal Tax EconomiBt, Bus & Ind. Research Div., 
Income Tax Unit, Bur. of Int. Rev., Room IBSB, Wash , D. C. 

Wherry, Robert, Ph.D. (Ohio Stato) On leave Dept, Psych. Univ, of N. C., Civilian Head, 
Stat. Anal. Unit AGO Personnel Research Section, B70 Madison Ave , N. Y. 

Wilcoxon, Frank, Ph.D. (Cornell) Group Leader, Insecticide & Fungicide Lab., Amer.
Cyanamid Co., Stamford, Conn., R.D. #1, Box S9a, Riverside, Conn.

Wolff, Marion, A.B. (Hunter) Asst. Math. Stat., Stat. Research Group, Div. of War
Research, Columbia University, 17B4 Crotona Park East, New York 60, N. Y.

Unknown Addresses 

Recent mail has not been delivered to the following members of the Institute 
at the addresses listed. If anyone knows of the current address of one or more 
of these members, please notify the Secretary-Treasurer at once, 

Lt. (jg) Gordon L, Bockstoad—Aor, Navy 161 % Fleet Postmaster, San Fran., Cal. 

Dr. Charles Wm. Cottcrman~637 Hawthorne Road, Winston Salem, Norlh GaroHna 
Mr, James Davidson—Box 344, Christinneburg, Virginia 

S/sgl George Elmatrom-Det of Pat., Hospital Plant. 4176 APO % PM, NYC, N. Y. 

Mr, Henry Goldberg—401 W. 118th St. New York 27, New York 

Mr Henry Hoblcy—Box 166, Pittsburgh 30, Pennsylvania 

Mr, John Mandel--45 Kew Gardens Road, Kew Gardens, New Y'ork 

Mr, David F, Votaw, Jr., USNTC—Bainbridge, Maryland 

Mr. Edward F. Wilson—Keswick Colony, Keswick Grove, New Jersey 



REPORT ON THE RUTGERS MEETING OF THE INSTITUTE

The Eighth Summer Meeting of the Institute of Mathematical Statistics
was held at the New Jersey College for Women, Rutgers University, New Brunswick,
New Jersey on Sunday, September 16, 1945, where the Summer Meeting
of the American Mathematical Society was also being held. The following
115 members of the Institute attended the meeting:

C. B. Allendoerfer, R. L. Anderson, T. W. Anderson, H. E. Arnold, I. L. Battin, Archie
Blake, C. I. Bliss, P. Boschan, A. H. Bowker, A. E. Brandt, G. W. Brown, R. H. Brown, T.
H. Brown, T. A. Budne, R. S. Burington, B. H. Camp, A. G. Carlton, P. C. Clifford, E. P.
Coleman, T. P. Cope, G. M. Cox, H. B. Curry, J. H. Curtiss, J. F. Daly, J. H. Davidson, B.
B. Day, W. E. Deming, H. F. Dodge, Jacques Dutka, P. S. Dwyer, Churchill Eisenhart,
Wade Ellis, Mary Elveback, Benjamin Epstein, C. D. Ferris, C. H. Fischer, M. M. Flood,
R. M. Foster, Milton Friedman, J. P. Gill, M. A. Girshick, Casper Goffman, A. A. Goodman,
Dorothy K. Gottfried, T. N. E. Greville, F. E. Grubbs, K. W. Halbert, Marshall Hall, P.
R. Halmos, Miriam S. Harold, Millard Hastay, Bernard Hecht, William Hodgkinson, I. S.
Hoffer, Harold Hotelling, A. S. Householder, W. Hurwicz, Irving Kaplansky, C. J. Kirchen,
Jack Laderman, Rafael Laguardia, H. G. Landau, Howard Levene, Harriet Levine, S. B.
Littauer, A. T. Lonseth, P. J. McCarthy, W. G. Madow, J. W. Mauchly, E. B. Mode, D. J.
Morrow, J. E. Morton, Judith Moss, P. M. Neurath, M. L. Norden, H. W. Norton, C. O.
Oakley, P. S. Olmstead, Edward Paulson, John Riordan, H. E. Robbins, H. G. Romig, William
Salkind, M. M. Sandomire, Arthur Sard, F. E. Satterthwaite, L. J. Savage, Henry Scheffé,
Bernice Scherl, Edward Schrock, I. E. Segal, C. E. Shannon, L. W. Shaw, Herbert Solomon,
Mortimer Spiegelman, J. R. Steen, Arthur Stein, P. F. Stephan, A. P. Stergion, L. V.
Toralballa, Mary N. Torrey, A. W. Tucker, L. R. Tucker, J. W. Tukey, Helen M. Walker,
W. A. Wallis, R. M. Walter, B. T. Weber, Joseph Weinstein, A. E. R. Westman, Frank
Wilcoxon, S. S. Wilks, Jacob Wolfowitz, C. P. Winsor, Ruth Zwerling.

The first session, on Sunday morning, was devoted to a symposium on Sequential
Analysis. Professor W. Allen Wallis, of Stanford University and Columbia
Statistical Research Group, acted as chairman for this session. The following
invited addresses were given.

1. Theory of Sequential Analysis.

Professor A. Wald, Columbia University and Columbia Statistical Research Group.

2. Construction of Multiple Sampling Inspection Plans for Attributes from Sequential
Principles.

Mr. Milton Friedman, National Bureau of Economic Research and Columbia
Statistical Research Group.

3. Applications of Sequential Analysis to the Ranking of Two Populations with Respect to
a Single Parameter.

Mr. M. A. Girshick, Bureau of Agricultural Economics and Columbia Statistical
Research Group.

The morning session was concluded after lively discussion on the symposium
topic.

Dr. W. Edwards Deming, of the Bureau of the Budget and President of the
Institute, presided at the afternoon session. The following papers were
presented:




1. On the Variance of a Random Set in n Dimensions.

Dr. Herbert E. Robbins, Post Graduate School, Annapolis.

2. The Non-Central Wishart Distribution and its Application to Problems in Multivariate
Statistics.

Dr. T. W. Anderson, Jr., Princeton University.

3. The Effect on a Distribution Function of Small Changes in the Population Function.

Professor Burton H. Camp, Wesleyan University.

4. On Composite Distributions.

Dr. Casper Goffman and Dr. Benjamin Epstein, Westinghouse Electric Corp.

5. Population, Expected Values and Sample.

Professor Emil J. Gumbel, New School for Social Research.

6. On the Selection of a Sample in Repeated Steps.

Dr. W. G. Madow, Bureau of the Census.

7. On Optimum Estimates for Stratified Samples.

Mr. Morris H. Hansen and Mr. William N. Hurwitz, Bureau of the Census. Presented
by Margaret Gurney.

8. Pearsonian Correlation Coefficients Associated With Least Squares Theory (Presented
by Title).

Professor P. S. Dwyer, University of Michigan.

The afternoon session concluded with the report of the Committee on the
Teaching of Statistics, which was presented by Professor Harold Hotelling of
Columbia University.

P. S. Dwyer,
Secretary



ON THE NORMAL APPROXIMATION TO THE BINOMIAL 
DISTRIBUTION 

By W. Feller
Cornell University

1. Although the problem of an efficient estimation of the error in the normal
approximation to the binomial distribution is classical, the many papers which
are still being written on the subject show that not all pertinent questions have
found a satisfactory solution. For a fixed $n$ and $0 < p < 1$, let $q = 1 - p$,

(1) $T_k = \binom{n}{k} p^k q^{n-k}, \qquad P_{\lambda,\nu} = \sum_{k=\lambda}^{\nu} T_k .$

For reasons of tradition (and, apparently, only for such reasons) one sets

(2) $x_k = (k - np)\sigma^{-1}, \qquad \sigma = (npq)^{1/2}$

and compares (1) with

(3) $N_k = (2\pi)^{-1/2}\sigma^{-1}e^{-x_k^2/2}$ and $\Pi_{\lambda,\nu} = \Phi\left(x_\nu + \frac{1}{2\sigma}\right) - \Phi\left(x_\lambda - \frac{1}{2\sigma}\right)$

respectively,¹ where $\Phi(x)$ stands for the normalized error function. Many
estimates are available for the maximum of the difference $|P_{\lambda,\nu} - \Pi_{\lambda,\nu}|$
for all $\lambda, \nu$. Now this error is $O(\sigma^{-1})$ and even a precise appraisal will
break down in the two most interesting cases: if $\sigma$ is small, or if $\lambda$ and $\nu$
are large as compared to $\sigma$. Indeed, even for moderately large values of $k$ (such
as are usually considered) the contribution of $T_k$ to the sum in (1) will be
considerably smaller than $\sigma^{-1}$, so that any estimate of the form $O(\sigma^{-1})$
leaves us without guidance. With some modifications this remains true also for
more refined estimates like Uspensky's remarkable result²


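(4) $P_{\lambda,\nu} = \Pi_{\lambda,\nu} + \frac{q - p}{6\sigma(2\pi)^{1/2}} \left[(1 - x^2)e^{-x^2/2}\right]_{x_\lambda - 1/(2\sigma)}^{x_\nu + 1/(2\sigma)} + \omega$

with

$|\omega| < (.13 + .18\,|p - q|)\,\sigma^{-2} + e^{-3\sigma/2},$

provided $\sigma \ge 5$. What is really needed in many applications is an estimate of
the relative error, but this seems difficult to obtain.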

It should also be noticed that the accuracy of the normal approximation to the
binomial is by no means quite as good as many texts would make it appear.
Examples using $p = 1/2$ and intervals which are symmetric with respect to $np$ are
hardly conclusive, since there the main error term drops out and systematic
positive and negative errors cancel. Again, in practice comparatively small
$\sigma$ and comparatively large $\nu$ are frequently used. It works well to compare a
$P_{\lambda,\nu}$ of a numerical value, say, .93 with a corresponding value $\Pi_{\lambda,\nu}$
of, say, .95. In classroom discussions the error may seem insignificant. However,
in most actual applications one would consider the complementary probabilities,
and the very same figures mean an approximation .05 to the correct value .07. If
a confidence limit is set to the five per cent level, the normal approximation
would in our example mean that two out of seven critical cases are missed.
Consider next the example $p = 1/10$, $n = 10,000$. For values of $k$ around 1120 the
relative error of $N_k$ is about .30; it increases rapidly with increasing $k$.
Around $k = 1150$ the relative error exceeds 2/3, around 1180 it is nearly 1.4. And
yet this example is conservative in comparison with many cases where the normal
approximation is used in practice.

¹ Very often, the limits $x_\lambda$ and $x_\nu$ instead of $x_\nu + \frac{1}{2\sigma}$ and
$x_\lambda - \frac{1}{2\sigma}$ are used. This naturally results in an unnecessary systematic
undervaluation.

² Uspensky [3], p. 129. A two-term development of $T_k$, with an error estimate
valid for $|x| < 2$, $\sigma > 3$, has been given by Mirimanoff and Dovaz [2].
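The size of such relative errors can be explored numerically. The following is only a sketch: it assumes that "relative error of $N_k$" means $(T_k - N_k)/N_k$ and that $p = 1/10$, $n = 10,000$; the function names are illustrative and not part of the paper. Under a different convention for the relative error the printed figures will differ from those quoted in the text.

```python
import math

def log_binom_term(n, k, p):
    """log T_k, with T_k = C(n, k) p^k q^(n-k), computed through log-gamma."""
    q = 1.0 - p
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(q))

def log_normal_term(n, k, p):
    """log N_k from (3), using the classical norming (2)."""
    sigma = math.sqrt(n * p * (1.0 - p))
    x = (k - n * p) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * x * x

def relative_error(n, k, p):
    """(T_k - N_k) / N_k; this convention is an assumption, see the lead-in."""
    return math.exp(log_binom_term(n, k, p) - log_normal_term(n, k, p)) - 1.0

if __name__ == "__main__":
    for k in (1120, 1150, 1180):
        print(k, round(relative_error(10_000, k, 0.1), 3))
```

Working with logarithms avoids the underflow that a direct evaluation of $T_k$ would meet this far out in the tail.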

It is surprising that the classical norming (2) is generally accepted although
there does not seem to exist any deeper reason for it. The use of moments,
though usually very convenient, does not necessarily lead to best results. For
example, the density function

(5) $f_n(x) = \frac{1}{n!}\, x^n e^{-x}$

is the $(n + 1)$-fold convolution of $f_0(x)$ with itself and therefore, for large $n$,
of nearly normal "type." The conventional norming would approximate
$f_n(x)$ by $\{2\pi(n + 1)\}^{-1/2} e^{-(x - n - 1)^2/2(n+1)}$, while the use of the norming
factor $n$ instead of $(n + 1)$ seems clearly indicated.
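A small numerical sketch of this remark follows. It assumes that "the norming factor $n$" means a normal density centred at $n$ with variance $n$, as against centre and variance $n + 1$ for the conventional norming; this reading, and the function names, are assumptions made for illustration only. The comparison is made at the mode $x = n$.

```python
import math

def gamma_density(n, x):
    """f_n(x) = x^n e^(-x) / n!, evaluated through logarithms."""
    return math.exp(n * math.log(x) - x - math.lgamma(n + 1))

def normal_density(x, center, variance):
    """Normal density with the given center and variance."""
    return (math.exp(-(x - center) ** 2 / (2.0 * variance))
            / math.sqrt(2.0 * math.pi * variance))

def errors_at_mode(n):
    """Absolute errors at x = n: (conventional norming, norming with n)."""
    x = float(n)
    exact = gamma_density(n, x)
    conventional = normal_density(x, n + 1, n + 1)  # mean and variance n + 1
    alternative = normal_density(x, n, n)           # assumed reading: n in both places
    return abs(conventional - exact), abs(alternative - exact)

if __name__ == "__main__":
    for n in (5, 20, 100):
        print(n, errors_at_mode(n))
```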

Actually, as will be seen, it is natural (at least for small values of $k - np$)
to replace (2) by

(6) $x_k = \{k + \tfrac12 - (n + 1)p\}\sigma^{-1}$

and accordingly to approximate $P_{\lambda,\nu}$ by the error integral taken between the limits

(7) $\{\lambda - (n + 1)p\}\sigma^{-1}$ and $\{\nu + 1 - (n + 1)p\}\sigma^{-1}$.

For example, let $p = 1/10$, $n = 500$, $\lambda = 50$, $\nu = 55$. The correct value is
$P_{50,55} = .317573$; the norming (2) leads to $\Pi_{50,55} = .32357$, while the more
natural limits (6) lead to an approximation .31989. More important are the quite
unexpected simplifications which the norming (6) permits when one studies the
error for large $x$ or small $\sigma$.
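The comparison in this example can be retraced with a short sketch. It assumes $p = 1/10$ and, for the limits (7), $\sigma^2 = (n + 1)pq$ as in (24); with $\sigma^2 = npq$ the last figure changes slightly. The function names are illustrative only.

```python
import math

def phi(z):
    """Normalized error function (standard normal distribution function)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def binom_term(n, k, p):
    """T_k = C(n, k) p^k q^(n-k), through log-gamma."""
    return math.exp(math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(p) + (n - k) * math.log(1.0 - p))

def exact_probability(n, p, lam, nu):
    """P_{lam,nu} as defined in (1)."""
    return sum(binom_term(n, k, p) for k in range(lam, nu + 1))

def classical_approximation(n, p, lam, nu):
    """Pi_{lam,nu} of (3), with the norming (2)."""
    sigma = math.sqrt(n * p * (1.0 - p))
    return phi((nu + 0.5 - n * p) / sigma) - phi((lam - 0.5 - n * p) / sigma)

def shifted_approximation(n, p, lam, nu):
    """Error integral between the limits (7); sigma^2 = (n+1)pq is an assumption."""
    sigma = math.sqrt((n + 1) * p * (1.0 - p))
    return phi((nu + 1 - (n + 1) * p) / sigma) - phi((lam - (n + 1) * p) / sigma)

if __name__ == "__main__":
    args = (500, 0.1, 50, 55)
    print(exact_probability(*args), classical_approximation(*args),
          shifted_approximation(*args))
```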

We are now led to reformulate the problem: instead of starting with arbitrary
limits for the error integral and estimating the resulting error, we shall try to determine
the limits so as to minimize the error. Theoretically, for any given $\lambda$, $\nu$ these limits
could be determined so as to give an exact value for $P_{\lambda,\nu}$. However, such limits
would depend in the most intricate way on $\lambda$ and $\nu$. For practical purposes one
would restrict the considerations to certain simple functions such as polynomials.





We shall here consider only the case where the limits are at most quadratic
polynomials. Essentially our problem seems to be that treated by Serge Bernstein
(and, apparently, only by him). In a series of papers since 1924, S. Bernstein
has considered the accuracy of the normal approximation. Quite recently³
he has, by a considerable computational effort, extended the range of validity
from $npq > 365$ to $npq > 62.5$ and proved the following

Theorem (S. Bernstein): Let

(8) $npq > 62.5$

and let a*, /S* he the solutions of the quadratic equations 

X - I ~ np = a^(npq)^'^ -)- 4 

(9) 

X + i ~ np = fi:.(npq)^'^ -b Pl ■ 

If 


(10) 

a > 0, 

|8 < 2’'*(npg)'"’ 

then 



(11) 

$(/3v) - 

Hfil) ^ jPx,)- < 4 ‘(q:k) — 


The conditions (10) are practically equivalent to 

(12) I' > njo + I, v<np + 2‘'V*'* 

The remarkable feature of this excellent result is that the error remains 0 (<t~^) 
throughout an interval which increases ivith u (instead of the conventional uni¬ 
formly bounded intervals). 

In the sequel it will be shown that startling simplifications can be obtained if
the norming (6) is used from the beginning instead of (2). Our main result is an
improvement of S. Bernstein's theorem. The condition (8) will be replaced by
$(n + 1)pq > 9$. The first condition in (10) will be relaxed to $\lambda > (n + 1)p$, that
is to say, our theorem will hold for all $k$ exceeding the central value (for those less
than the central value an analogous theorem holds); in the other condition (10),
the numerical value $2^{1/2}$ will be replaced by an arbitrary constant. Instead of
quadratic equations, we shall consider quadratic polynomials. And finally, the
gap between the two sets of limits will be reduced.

It will be seen that the computations leading to this improvement are almost
negligible in comparison with S. Bernstein's deeper method; with slightly more
sophisticated arguments and numerical evaluations, our results can be
considerably improved. Our consideration will be based on a new expression for
$T_k$, in which only exponential terms appear but the usual square root is missing.

³ S. Bernstein [1]; the first paper of the series appears to have appeared in Uchenye Zapiski,
Kiev, 1924.





In passing from approximations to $T_k$ to approximations to $P_{\lambda,\nu}$ one has to
replace sums by integrals. This procedure is cumbersome if an estimate of the
relative error is desired. Euler's formula and other standard formulas are of
little use. We shall therefore start with a lemma which, it is hoped, may be
useful in this connection; it will therefore be proved in a slightly more general
form than actually required for the present paper.

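2. Lemma⁴ 1. For $0 < h < \tfrac12$ and $|xh| < 1$

(13) $\int_{x - h/2}^{x + h/2} e^{-u^2/2}\,du = h\,e^{-x^2/2 + (x^2 - 1)h^2/24 + \varepsilon},$

where

(14) $-\frac{x^4h^4}{880} < \varepsilon < \frac{h^4}{285}.$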


Proof. Denote the integral in (13) by $J$. Then

(15) $h^{-1}e^{x^2/2}J = h^{-1}\int_{-h/2}^{h/2} e^{-xt - t^2/2}\,dt = 2h^{-1}\int_0^{h/2} \operatorname{ch}(xt)\, e^{-t^2/2}\,dt.$


We begin by showing that for 0 < a < 4 
(IG) 

In fact 


(17).*„ > (i+i+i)(i+£ 1+^+1* E (0 > 

and 

>(i+1+i)(i - i) > 1+1+^ 


(18) 


It follows from (15) and (16) that 

fA/2 

h~^d"' 




> 2r' f 

Jo 


(i2-0(»/(l-rU*/66 /i*-m2/3-4iUV66 j, 

e -e at 


(19) 


Ml 

> 2;r‘ 

Jo 




1 +■ 


- 1 0 _ 4a;* t* 
'3 ‘ 55 


dt 


which proves one part of the lemma. 

To obtain an upper estimate we make use of the inequalities


„(** m’/a 




-(»/ar*U4ns 


' The fraction i is chosen quite arbitrarily; if A be restricted to 0 < A < 1 the first member 
of (14) remains unchanged, while the traction — on the right side has to be replaced by r^. 





1 - — 

( 20 ) <(l + ^'^^1 - 

^ “ 3 

X ~ 1 j2\ i<l</18+Ji^/2B6 

V 

Using (16) and (20), the proof of the second part of the lemma follows from a
computation analogous to (19).
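The lemma lends itself to a numerical check. The sketch below assumes the statement in the form $\int_{x-h/2}^{x+h/2} e^{-u^2/2}du = h\exp\{-x^2/2 + (x^2-1)h^2/24 + \varepsilon\}$ with $-x^4h^4/880 < \varepsilon < h^4/285$; this form is a reading of the lemma adopted here for illustration, and the function names are hypothetical. The integral is evaluated through the complementary error function so that accuracy is kept far out in the tail.

```python
import math

def gaussian_integral(a, b):
    """Integral of exp(-u^2/2) over [a, b], via erfc to keep accuracy in the tail."""
    root2 = math.sqrt(2.0)
    return math.sqrt(math.pi / 2.0) * (math.erfc(a / root2) - math.erfc(b / root2))

def residual(x, h):
    """epsilon = log(J/h) + x^2/2 - (x^2 - 1) h^2 / 24, J being the integral in (13)."""
    j = gaussian_integral(x - h / 2.0, x + h / 2.0)
    return math.log(j / h) + x * x / 2.0 - (x * x - 1.0) * h * h / 24.0

def within_assumed_bounds(x, h, slack=1e-12):
    """Check the assumed bounds -x^4 h^4 / 880 <= epsilon <= h^4 / 285."""
    eps = residual(x, h)
    return -(x * h) ** 4 / 880.0 - slack <= eps <= h ** 4 / 285.0 + slack

if __name__ == "__main__":
    for h in (0.1, 0.25, 0.5):
        for x in (0.0, 1.0, 0.9 / h):
            print(h, x, residual(x, h), within_assumed_bounds(x, h))
```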

For our purposes it is convenient to use Stirling's formula in a form which is
not quite the usual one.

Lemma 2. {Stirling’s formulas). For n > 4, 

^21) .^1 = (2ir)^(n i)'‘+ie-(’‘+i^-i'2««+i>+P/“88i))(i+jL)/cn+i)3 


<(i + 


I I < Ff 1 ? ^ 0 as n —> ». 

Formula (21) can be derived from the gamma function or in any other way
that leads to the standard form (22).⁵
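As an illustration of a Stirling formula "in a form which is not quite the usual one", the sketch below checks the well-known expansion of $\log n!$ in powers of $(n + \tfrac12)^{-1}$, namely $\log n! \approx \tfrac12\log 2\pi + (n+\tfrac12)\log(n+\tfrac12) - (n+\tfrac12) - \frac{1}{24(n+\frac12)} + \frac{7}{2880(n+\frac12)^3}$. That this is the expansion underlying (21) is an assumption; the remainder bounds of the lemma are not reproduced here.

```python
import math

def log_factorial_half_shift(n, terms=2):
    """Approximate log n! by the expansion in powers of 1/(n + 1/2).

    terms=1 keeps the correction -1/(24 m); terms=2 adds +7/(2880 m^3),
    where m = n + 1/2.
    """
    m = n + 0.5
    value = 0.5 * math.log(2.0 * math.pi) + m * math.log(m) - m - 1.0 / (24.0 * m)
    if terms >= 2:
        value += 7.0 / (2880.0 * m ** 3)
    return value

if __name__ == "__main__":
    for n in (4, 10, 50):
        exact = math.lgamma(n + 1)
        print(n, exact - log_factorial_half_shift(n, 1),
              exact - log_factorial_half_shift(n, 2))
```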

3. From now on we shall put

(24) $\sigma^2 = (n + 1)pq$

(25) $x_k = \{k + \tfrac12 - (n + 1)p\}\sigma^{-1};$


or 

( 22 ) 

where 

(23) 


the subscript $k$ will be omitted whenever no confusion is to be feared. To
transform $T_k$ we shall use (21) for the factorials in the denominator, but (22) for
$(n + 1)!$ in the numerator.


3 A simple proof runs as follows. Put Ba = Jiifu + i) Then 

P .-1 A fi 1 1 1 _7i + a, 

B, 2./(2v + D) (2p)2' 60 (2p)< 

with 0<6i<^ifp^5 From here (21) follows using the fact that 
V log = log Bn — i log {2ir) 

p-n+l 

and that for n ^ 4 

1 - S “ 1 1 

3(n + i)” ^ 3(n + i)“ 

3 

wilh 0 < 3 < —. In this way the estimate (23) can be considerably improved. 





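Then

(26) $\log\left((2\pi)^{1/2}\sigma T_k\right) = (n + 1)\log(n + 1) - (k + \tfrac12)\log\frac{k + \frac12}{p} - (n - k + \tfrac12)\log\frac{n - k + \frac12}{q}$
$\qquad + \frac{1}{12(n + 1)} + \frac{1}{24(k + \frac12)} + \frac{1}{24(n - k + \frac12)} - \rho,$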

f27) 


.7/_ 4-_j 

- - 6\360(n +'i)* 2880 Lffc + W (n - ^“-f- 


+ ?1 + 

12<r"- ^ 


<^..1 ^ 
- 0 36()<r‘ 


p“<? + + 9 )N 


provided only that A; > 4, (n — A;) > 4. Asymptotieally p is (Hiuivalent to the 
right-hand member without factor J (which, by the ^vay. could be replaced by 
1 + ?V)- Obviously 

1 


(28) 




if A: > 4, n — A: > 4. We shall consider later on the case c > 3, | a‘ | < jo-; 
then clearly A:>4,n — A:>4, so that the use of (28) will be justified. Plxpand- 
ing (26) into a power series we obtain 
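Theorem. If $k > 4$, $n - k > 4$,

(29) $T_k = (2\pi)^{-1/2}\sigma^{-1}\exp\left\{-\sum_{\nu=2}^{\infty}\frac{p^{\nu-1} - (-q)^{\nu-1}}{\nu(\nu - 1)}\,\frac{x^{\nu}}{\sigma^{\nu-2}} + \frac{1}{24\sigma^2}\sum_{\nu=1}^{\infty}\{p^{\nu+1} - (-q)^{\nu+1}\}\left(\frac{x}{\sigma}\right)^{\nu} + \frac{1 + 2pq}{24\sigma^2} - \rho\right\},$

where $\rho$ satisfies (28) (and (27)); $x$ and $\sigma$ are defined by (25) and (24), respectively.
Each term of the second series will usually be small as compared to the
corresponding term of the first series; the second series can therefore, if desired, be
absorbed in the error term. If $x$ is small the first term of the first series will be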
preponderant, However, as x increases, more and more terms will make them¬ 
selves noticeable; if a; three terms •will bo essential, and so on. 
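The expansion can be checked numerically. The sketch below assumes that (26) reads $\log((2\pi)^{1/2}\sigma T_k) = (n+1)\log(n+1) - (k+\tfrac12)\log\frac{k+1/2}{p} - (n-k+\tfrac12)\log\frac{n-k+1/2}{q} + \frac{1}{12(n+1)} + \frac{1}{24(k+1/2)} + \frac{1}{24(n-k+1/2)} - \rho$, and that the two series of (29) are $\sum_{\nu\ge2}\frac{p^{\nu-1}-(-q)^{\nu-1}}{\nu(\nu-1)}\frac{x^\nu}{\sigma^{\nu-2}}$ and $\frac{1}{24\sigma^2}\sum_{\nu\ge1}\{p^{\nu+1}-(-q)^{\nu+1}\}(x/\sigma)^\nu$. These forms are obtained by direct expansion and are assumptions as far as the printed formulas are concerned; the function names are illustrative.

```python
import math

def log_T_exact(n, k, p):
    """log T_k through log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def log_T_closed(n, k, p):
    """Closed expression (assumed form of (26)) with the remainder rho omitted."""
    q = 1.0 - p
    sigma2 = (n + 1) * p * q
    main = ((n + 1) * math.log(n + 1)
            - (k + 0.5) * math.log((k + 0.5) / p)
            - (n - k + 0.5) * math.log((n - k + 0.5) / q))
    corr = 1.0 / (12 * (n + 1)) + 1.0 / (24 * (k + 0.5)) + 1.0 / (24 * (n - k + 0.5))
    return main + corr - 0.5 * math.log(2.0 * math.pi * sigma2)

def log_T_series(n, k, p, terms=60):
    """Series expression (assumed form of (29)), truncated, with rho omitted."""
    q = 1.0 - p
    sigma = math.sqrt((n + 1) * p * q)
    x = (k + 0.5 - (n + 1) * p) / sigma
    first = sum((p ** (v - 1) - (-q) ** (v - 1)) / (v * (v - 1)) * x ** v / sigma ** (v - 2)
                for v in range(2, terms + 2))
    second = sum((p ** (v + 1) - (-q) ** (v + 1)) * (x / sigma) ** v
                 for v in range(1, terms + 1)) / (24.0 * sigma ** 2)
    const = (1.0 + 2.0 * p * q) / (24.0 * sigma ** 2)
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - first + second + const

if __name__ == "__main__":
    n, k, p = 500, 55, 0.1
    print(log_T_closed(n, k, p) - log_T_series(n, k, p))  # truncation and rounding only
    print(log_T_closed(n, k, p) - log_T_exact(n, k, p))   # the remainder rho
```

The second printed difference plays the role of $\rho$; it is small and positive for the values tried here.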

Formula (29) permits us to approximate $P_{\lambda,\nu}$ by means of integrals. The
tangent rule would suggest comparing $P_{\lambda,\nu}$ with

(30) $\Phi\left(x_\nu + \frac{1}{2\sigma}\right) - \Phi\left(x_\lambda - \frac{1}{2\sigma}\right),$




and (29) together with lemma (1) permits easily to estimate the relative error 
in the practically most important cases It is also seen that the limits in (30) 
are essentially the only limits depending linearly on X and v w'hich will render the 
relative error 0((7 ^) for x = 0(1). Instead of elaborating on these simple 
questions we proceed to the more intricate problem of limits which are quadratic
polynomials in $\lambda$ and $\nu$.

4. For brevity we shall from now on put

(31) $\frac{p - q}{6} = a.$

The estimate $|a| \le \tfrac16$ will be used constantly. It obviously suffices to consider
values of $\lambda \le \nu$ which exceed the central value $(n + 1)p$.

Theorem Suppose that 

(32) 


and 

(33) 
Then 

(34) 

(35) 


£r > 3 

X > (n + i)p r + 5 < (n + l)p + fcr^. 
Px,. < - 4>(W), 


Vk = 


fc - (n + l)p jk - (n + l)p\- 

<r O’ 


/ 


^ _ 1 
^ (T 2cr“’ 


while the inequality in (34) is leverscd if 
Vk 

where 
(37) 


( 36 ) „ - h - »t . Pi’ + i h - (" + Di’ V + + 

O’ (T [ <T j (F Off Iff 


_ _ (r + i - in + l)pl° 


a cr 


The gap between the limits (35) and (36) is if xl = 0(cr). In S. Bern¬ 

stein’s case (12), M < \/2 and the gap is about 2/(5a-). It will be seen from 
the proof that it requires only routine computations to improve the correction 
M , l\__i 


term + -|<r in (36) 
Proof. Put 


(38) = .rx + -xl, 

ff 

again suppressing the subscripts wherever convenient. As a consequence ot 
(33),, we shall be concerned only with values xi, satisfying 


( 39 ) 





Consider first the main series in (29) and write
' fiY' y 

ff' 


(40) 

where 

(tl) 




3 I 3 

T) + q 
12 


- I) 

2 \ 4 

a \x 


= ht + ^4, 


- 2 -t- 2 


p'“‘ - (- qr' y 

« - n ' 


We shall require some estimates of $A$. First consider the case $a \ge 0$. Then all
terms of the series are positive, while the expression within parentheses assumes
its minimum for p = liy (39) f < Y .r, whenee 


(42) 


A>.U if »>0. 


If $a < 0$ the signs in the series (41) alternate, each negative term being smaller
in absolute value than the preceding positive term. Therefore, using (39),


(43) 


X > "1+/ 

- ^ 12 


a 

2 


Q - P 
30 ■ 


rri' 


The expression within braces is a cubic in $p$ which assumes its minimum for $p =
(1 + \sqrt{793})/72 = .405\ldots$. It follows that


(44) 




(half of this estimate would actually suffice for our purposes). On the other 
hand, it is evident from (41) that the ratio A/x^ attains its maximum for p = 1. 
Therefore, using (39) 


(45) 

Next we write 

(46) 
whence 




(p-‘ - (-«)"! 


(r =5 


f + .B> 


r^z I ^3 2 - 00 / \ r^2 

(47) B.j[r^-|]j + -^Eir‘-(-,)-‘i(5) . 

A trivial computation analogous to (43) shows that B > 0. Again, if a < 0, 
the signs in the series (47) alternate and in this case 


(48) 


0 


r 8 r 3 2"1 2 

<B<i ^ < 

* L 12 2jy - 


A- A 

144 


<1^- 





If a > 0 we can majorize (47) by a geometric series and obtain 


- “ 8 0^ - 8 


Now put 




, , 2a > 
1 H— Xk 
(T ) 


(51) Ifc + Mb = b+i — Mb+i 

so that the mtervals with endpoints b ± Mb ^^e non-overlapping and con¬ 
tiguous. Clearly 


AJ..- , 


Introducing (40), (46), and (52) into (29) we obtain 


T, = (27r)-‘'M| ~ A + B ~ ^log 


0 + v) 


, l + 2pg 


To appraise the logarithmic term we write 


}log(l + ^^)-^'-C 


C f ‘ attains its maximum value when a = — and it is readily seen that 

0 < C < if o > 0 

(55) „ V 

0 < 0 < ^ if a < 0. 


Finally we put, with a parameter $u$ to be determined,


2 / = ? + 


2a — u 


Ay = Af. 


If one puts 


1 o 
“ “ ^ 4^ 


and Tji is defined by (35), then 


. 2/t + Mf/t = Vi+i, Vh — Mj/i = i)k . 





On the other hand, if

(59)


n 


M 

■f) 


I _ a 
7 4^= 


and $\eta_k$ is defined by (36), the identities (58) hold again. Accordingly, all we have
to show is that, with $u$ defined by (57),

(60) $T_k < \Phi(y_k + \tfrac12\Delta y_k) - \Phi(y_k - \tfrac12\Delta y_k)$

and that the inequality in (60) is reversed if $u$ is defined by (59).
Elementary transformations lead from (53) to


(61) 

n = (27r'.'*Aye\p. 

+ 

1 

(Ay)* 

“2i 

where 




(62) i 

,,, 7i’ — 4aa / u 

5a \ 

^ ~ 2 

'2a2 V 

l2aV 


1 

24ff* 




A + S + C - p. 


Let now $u$ be defined by (57). In view of lemma 1 and (61), the inequality
(60) will be proved if we show that


(63) 

Now clearly 

(64) 




23 **j“ 


yW 

880 


< 0. 


i y\AyV 

a? \ <r / 24 “ 880 ‘ 


Moreover, introducing the estimates (28), (32), (42), (44), (48), (49), and (55)
into (62) it is seen that for $a \ge 0$


(65) 

and for o < 0 
((56) 


< 


24a 64 ^ ^ 8 ' 74 ^ ’ 


a'Ex 


9(r 


- if + If - ~ {^ 


60 


Ahe derivatives of the light-hand members in ((55) and (6(5) are both negative 
for t > 0 Now we arc interested only in values r satisfying (39). For such 


values f > 


W7 

2l(w' 


For f = 


107 

2U)a 


the right-hand members in (6S) and (66) are 


negative, so that Ex < 0 for j- > “ . This proves the first part of our theorem, 

The proof that with (59) the inequality in (60) is reversed proceeds on similar
lines. We have to show that



(Ay)* 

285 


> 0 . 


(67) 





Suppose that $a < 0$, which is the less favorable case. Then, by (45), (37), and
(39),


( 68 ) 


. 2M X ^ 
- l5 


Similarly 

(69) 



Using (62) we have therefore, neglecting the non-negative terms B and C, 


(70) 


„ > jU _ iL _ _J_ 

3(1' 250ff' 



_5_ _ ,W _ J_ t' 

72(t' 20(t 12£r'j 24(r'' ’ 


The expression at the right side represents a parabola, and it suffices to show that 
it assumes positive values at the endpoints of our interval (39). Now 


(71) 



?>-i 

3 - 18 


1 

1 

12(r' 


> - 


107’ 


and simple arithmetic shows that, with (59), the expression within the braces
more than counterbalances the negative terms outside.⁶ If $a > 0$ the situation
is more favorable and the estimate (59) can then be further improved.

REFERENCES

[1] S. Bernstein, "Retour au problème de l'évaluation de l'approximation de la formule
limite de Laplace," [in Russian] Izvestia Akademii Nauk SSSR, Serija
matematičeskaja, 7 (1943), pp. 3-16.

[2] S. Mirimanoff and R. Dovaz, "Les épreuves répétées et la formule de Laplace," C.
R. Acad. Sci., Paris, 185 (1927), pp. 827-829.

[3] J. V. Uspensky, Introduction to Mathematical Probability. New York, McGraw-Hill,
1937.

Mia 

' A more careful computation shows that it suffices if we put a - - ^ instead 

M 1 0 



THE VARIANCE OF THE MEASURE OF A TWO-DIMENSIONAL RANDOM 

SET 


By J. Bronowski and J. Neyman
Princes Risborough, England and the University of California


1. Introduction. In a recent paper H. E. Robbins¹ has solved the problem
of the variance of the measure of a one-dimensional random set. The present
paper treats a similar problem relating to a two-dimensional random set under
somewhat more general conditions.

Let $R$ denote a rectangle of dimensions $a \times b$ whose position is fixed. Let $R'$
denote another fixed rectangle concentric with $R$, its sides $a + 2\gamma$ and $b + 2\gamma$ (where
$\gamma > 0$) being parallel to the sides $a$ and $b$ respectively of $R$. Finally, let $\rho$ denote
a rectangle of fixed dimensions but variable position, whose sides $\alpha < 2\gamma$ and
$\beta < 2\gamma$ are parallel to $a$ and $b$ respectively, but the position of whose center will
be considered as random. In fact it will be assumed that the rectangle $\rho$ is
dropped on the plane of $R$ in a manner which satisfies the following two
assumptions:

(i) The probability that the center of $\rho$ falls within $R'$ exactly $s$ times has a
defined value $P_s$ for each $s = 0, 1, 2, \cdots$. Thus, if $\Psi(u)$ denotes the probability
generating function of $s$, so that

(1) $\Psi(u) = \sum_{s=0}^{\infty} u^s P_s,$

then $\Psi(u)$ is assumed known but will be left arbitrary till the general result is
obtained.

(ii) Whenever a fixed number $s$ of centers of $\rho$ fall within $R'$, it will be assumed
that the probability that exactly $k$ centers of $\rho$ fall within any chosen sub-area $\omega$
contained in $R'$ is given by the binomial expression

(2) $\frac{s!}{k!\,(s - k)!}\left(\frac{\omega}{R'}\right)^{k}\left(1 - \frac{\omega}{R'}\right)^{s-k}.$

Under the above conditions, denote by $E$ the set of all those points of $R$ which
are covered at least once by the rectangle $\rho$ during the course of the trials
considered. Let $X$ denote the measure of $E$. The purpose of this paper is to
evaluate the first two moments of $X$.

First, the computations will be made for the case when $s$ is fixed, i.e. when

(3) $\Psi(u) = u^s.$

The values of the two moments of $X$ computed for fixed $s$ will be denoted by
$M_1(a, b \mid s)$ and $M_2(a, b \mid s)$. Next, the moments of $X$ will be evaluated for an
arbitrary generating function $\Psi(u)$, and these will be denoted by $M_1(a, b)$ and
$M_2(a, b)$.

¹ H. E. Robbins, "On the measure of a random set", Annals of Math. Stat., Vol. 15
(1944), pp. 70-74.




H. E. Robbins has found the first moment

(4) $M_1(a, b \mid s) = ab\left\{1 - \left(1 - \frac{\alpha\beta}{R'}\right)^{s}\right\}.$

Also, for a one-dimensional set, he has obtained the second moment, say ilf 2 (tt|s), 
when a < a. 

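It follows immediately from (4) and (1) that, whatever be the probability
generating function $\Psi(u)$,

(5) $M_1(a, b) = ab\left\{1 - \Psi\left(1 - \frac{\alpha\beta}{R'}\right)\right\}.$

In particular, if the probabilities $P_s$ are those of Poisson when the density of
positions of the center of $\rho$ per unit of area is $\lambda$, so that

(6) $\Psi(u) = e^{-\lambda R'(1 - u)},$

then

(7) $M_1(a, b) = ab\{1 - e^{-\lambda\alpha\beta}\}.$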

Our remaining problem, therefore, is that of evaluating the second moment of 
X. Instead we shall evaluate the second moment of 

( 8 ) Y = ab - X, 

and shall denote it by m(a, b | s) or m{a, b) according as s is or is not considered 
to be fixed. 


2. Derivative of the second moment of Y. In order to evaluate $m(a, b)$, we
begin by calculating its second (mixed) derivative, say $D(a, b \mid s)$, where

(9) $D(a, b \mid s) = \frac{\partial^2 m(a, b \mid s)}{\partial a\,\partial b}$
$\qquad = \lim \frac{1}{\Delta a\Delta b}\{m(a + \Delta a, b + \Delta b \mid s) - m(a, b + \Delta b \mid s) - m(a + \Delta a, b \mid s) + m(a, b \mid s)\}$
$\qquad = \lim \frac{J(\Delta a, \Delta b)}{\Delta a\Delta b}$ (say),

where $\Delta a$ and $\Delta b$ are the increments of $a$ and $b$ respectively. Once $D(a, b \mid s)$
is found, the formula for $m(a, b \mid s)$ will be obtained by two quadratures. For
definiteness we shall assume $\Delta a$ and $\Delta b$ both to be positive, but of course the
argument which follows applies equally to other cases.

Consider the rectangle of dimensions $(a + \Delta a)$ and $(b + \Delta b)$ as shown in Figure
1, and denote by $U$, $V$ and $W$ the measures of the "uncovered" parts of the three
rectangles $\Delta a \times b$, $a \times \Delta b$, and $\Delta a \times \Delta b$ respectively. That is to say, $U$, $V$
and $W$ are defined with respect to these three rectangles precisely in the same
manner in which $Y$ is defined with respect to the original rectangle $a \times b = R$.
Using the letter $E$ to denote the expectation, we easily find that

(10) $J(\Delta a, \Delta b) = 2E(YW) + 2E(UV)$
$\qquad\qquad + 2E(VW) + 2E(UW) + E(W^2).$

However, each of the three expectations in the second line of formula (10) is
infinitesimal of an order higher than the product $\Delta a\Delta b$. In fact, none of the
variables $U$, $V$ and $W$ can exceed the area of the rectangle of which it forms part,
that is,

(11) $0 \le U \le b\Delta a, \qquad 0 \le V \le a\Delta b, \qquad 0 \le W \le \Delta a\Delta b.$

It follows that

(12) $0 \le E(UW) \le b(\Delta a)^2\Delta b, \qquad 0 \le E(VW) \le a\Delta a(\Delta b)^2, \qquad 0 \le E(W^2) \le (\Delta a\Delta b)^2.$

Hence, from (9), (10) and (12)

(13) $D(a, b \mid s) = 2\lim \frac{1}{\Delta a\Delta b}\{E(YW) + E(UV)\}.$

We now reduce the calculation of (13) to finite form by approximating to the
infinite sets $Y$, $U$, $V$, $W$ by progressively more ample but finite sets. To do so,
we cover $R'$ by progressively more ample but finite networks of points. More
precisely: consider a rectangular system of axes $O\xi$ and $O\eta$ oriented as in Figure 1
so that the axes are common boundaries of $a \times b = R$ and of the rectangles
obtained by increasing $a$ and $b$. Let

(14) $d_n = a/(n + 1), \qquad \delta_n = b/(n + 1).$

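Consider the lattice of points $(ij)$ with coordinates

(15) $\xi_i = i d_n, \qquad \eta_j = j\delta_n,$

for $i = -\nu_1^{(n)}, -\nu_1^{(n)} + 1, \cdots, 0, 1, 2, \cdots, n$; $j = -\nu_2^{(n)}, -\nu_2^{(n)} + 1, \cdots,
0, 1, 2, \cdots, n$, where $\nu_1^{(n)}$ and $\nu_2^{(n)}$ are the greatest integers such that

(16) $\nu_1^{(n)} d_n \le \Delta a$

and

(17) $\nu_2^{(n)} \delta_n \le \Delta b.$

To simplify the writing, the superscripts $(n)$ will henceforth be dropped.

With every point $(ij)$ we associate a random variable $x_{ij}$ defined as follows.
If in the course of the trials contemplated none of the rectangles $\rho$ covers $(ij)$,
then $x_{ij} = 1$. Otherwise $x_{ij} = 0$. Further, write

(18) $Y_n = d_n\delta_n \sum_{i=0}^{n}\sum_{j=0}^{n} x_{ij}, \qquad U_n = d_n\delta_n \sum_{i=-\nu_1}^{0}\sum_{j=0}^{n} x_{ij},$
$\qquad V_n = d_n\delta_n \sum_{i=0}^{n}\sum_{j=-\nu_2}^{0} x_{ij}, \qquad W_n = d_n\delta_n \sum_{i=-\nu_1}^{0}\sum_{j=-\nu_2}^{0} x_{ij}.$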

Now the boundary of the set $E$, for a fixed $s$, consists of one or more polygons
having a finite total number of sides each of bounded length. It follows that,
given any $\epsilon > 0$, there exists, for a fixed $s$, a number $N_\epsilon(s)$ such that $n > N_\epsilon(s)$
implies that

(19) $|Y_n - Y| < \epsilon,$

with similar inequalities relating to $U_n$, $V_n$ and $W_n$. Hence it follows
immediately that

(20) $\lim_{n\to\infty} E(Y_nW_n \mid s) = E(YW \mid s), \qquad \lim_{n\to\infty} E(U_nV_n \mid s) = E(UV \mid s).$





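The expectations in formula (13) will therefore be obtained as limits of those
on the left hand sides of (20). We have

(21) $E(Y_nW_n \mid s) = d_n^2\delta_n^2 \sum_{i=-\nu_1}^{0}\sum_{j=-\nu_2}^{0} E\left(x_{ij}\sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl} \,\Big|\, s\right),$

(22) $E(U_nV_n \mid s) = d_n^2\delta_n^2 \sum_{i=-\nu_1}^{0}\sum_{j=0}^{n} E\left(x_{ij}\sum_{k=0}^{n}\sum_{l=-\nu_2}^{0} x_{kl} \,\Big|\, s\right).$

Hitherto we have made no assumptions concerning the values of $\Delta a$ and $\Delta b$.
Since these are to tend to zero, we may assume that

(23) $0 < \Delta a < \gamma - \alpha/2, \qquad 0 < \Delta b < \gamma - \beta/2.$

On this assumption, we shall now compute the expectations of the type
$E(x_{ij}x_{kl} \mid s)$, of which (21) and (22) are linear combinations.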

Since the variables $x_{ij}$ and $x_{kl}$ are capable only of the two values unity and
zero, the expectation of their product is simply the probability that both of them
are equal to unity, i.e. the probability that both points $(ij)$ and $(kl)$ are "missed"
by all the $s$ rectangles $\rho$ falling on $R'$. This probability may have one of two
forms. If both

(24) $|i - k|\,d_n < \alpha$ and $|j - l|\,\delta_n < \beta,$

then

(25) $E(x_{ij}x_{kl} \mid s) = \left\{1 - \frac{2\alpha\beta - (\alpha - |i - k|\,d_n)(\beta - |j - l|\,\delta_n)}{R'}\right\}^{s};$

while otherwise

(26) $E(x_{ij}x_{kl} \mid s) = \left(1 - \frac{2\alpha\beta}{R'}\right)^{s},$

in each case, in virtue of the assumption (ii) of Section 1.
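The two forms of this probability can be put into a few lines of code. A point is covered by $\rho$ when the center of $\rho$ falls in an $\alpha \times \beta$ rectangle centered at the point, so that two points are both missed by one drop when the center avoids the union of two such rectangles. The sketch below takes this geometric reading as its assumption; `area` stands for the area of $R'$, and the function names are illustrative.

```python
def union_area(dx, dy, alpha, beta):
    """Area of the union of two alpha x beta rectangles whose centers differ by (dx, dy)."""
    overlap = max(alpha - abs(dx), 0.0) * max(beta - abs(dy), 0.0)
    return 2.0 * alpha * beta - overlap

def both_missed(dx, dy, alpha, beta, area, s):
    """Probability that two points separated by (dx, dy) are both missed by s drops."""
    return (1.0 - union_area(dx, dy, alpha, beta) / area) ** s

if __name__ == "__main__":
    alpha, beta, area, s = 0.5, 0.4, 9.36, 5
    print(both_missed(0.0, 0.0, alpha, beta, area, s))   # coincident points
    print(both_missed(0.25, 0.1, alpha, beta, area, s))  # overlapping case
    print(both_missed(0.7, 0.1, alpha, beta, area, s))   # separated case
```

For coincident points the expression reduces to $(1 - \alpha\beta/R')^s$, the probability that a single point is missed.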

The essential content of equations (24) to (26) is that, once the other variables
appearing in them are assigned, $E(x_{ij}x_{kl} \mid s)$ is a function only of the differences
$i - k$ and $j - l$. It is this fact which allows us to evaluate the limits of the
quantities in (21) and (22) in a simple manner, in effect by holding one of the
two freely variable points $(ij)$, $(kl)$ in a fixed position, say at the origin. Thus,
let

(27) $E(\theta_n \mid s) = d_n^2\delta_n^2 \sum_{i=-\nu_1}^{0}\sum_{j=-\nu_2}^{0} E\left(x_{ij}\sum_{k=i}^{n+i}\sum_{l=j}^{n+j} x_{kl} \,\Big|\, s\right).$
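Owing to the remark just made, the expectation

(28) $E\left(x_{ij}\sum_{k=i}^{n+i}\sum_{l=j}^{n+j} x_{kl} \,\Big|\, s\right) = E\left(x_{00}\sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl} \,\Big|\, s\right)$

and it follows that

(29) $E(\theta_n \mid s) = (\nu_1 + 1)(\nu_2 + 1)\, d_n^2\delta_n^2\, E\left(x_{00}\sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl} \,\Big|\, s\right)$
$\qquad = [(\nu_1 + 1)(\nu_2 + 1)\, d_n\delta_n]\left[d_n\delta_n\sum_{k=0}^{n}\sum_{l=0}^{n} E(x_{00}x_{kl} \mid s)\right].$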

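Of the two factors in the square brackets in (29), the first tends to $\Delta a\Delta b$ as $n$
tends to infinity, and the second tends to the integral

(30) $\int_0^a\int_0^b f(\xi, \eta)\,d\xi\,d\eta,$

where

(31) $f(\xi, \eta) = \left\{1 - \frac{2\alpha\beta - (\alpha - \xi)(\beta - \eta)}{R'}\right\}^{s}$

if both $0 \le \xi < \alpha$ and $0 \le \eta < \beta$, and

(32) $f(\xi, \eta) = \left(1 - \frac{2\alpha\beta}{R'}\right)^{s}$

otherwise. Thus the computation of the limit of $E(\theta_n \mid s)$ is straightforward.
It remains to show that it differs from that of $E(Y_nW_n \mid s)$ in equation (21) by
an infinitesimal which is of an order higher than the product $\Delta a\Delta b$.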

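Since the variables $x_{ij}$ are capable only of the two values unity and zero, the
absolute value of the difference between the brackets in (21) and (27), that is,
between

(33) $x_{ij}\sum_{k=0}^{n}\sum_{l=0}^{n} x_{kl}$ and $x_{ij}\sum_{k=i}^{n+i}\sum_{l=j}^{n+j} x_{kl},$

cannot be greater than $-n(i + j) \le n(\nu_1 + \nu_2)$. It follows that

(34) $|E(Y_nW_n \mid s) - E(\theta_n \mid s)| \le [d_n\delta_n(\nu_1 + 1)(\nu_2 + 1)][n\delta_n\nu_1 d_n + n d_n\nu_2\delta_n].$

As $n$ tends to infinity, the right hand side of (34) tends to the product

(35) $\Delta a\Delta b\,[b\Delta a + a\Delta b];$

whence

(36) $\lim_{\Delta a, \Delta b \to 0}\frac{E(YW \mid s)}{\Delta a\Delta b} = \lim_{\Delta a, \Delta b \to 0}\frac{1}{\Delta a\Delta b}\left\{\lim_{n\to\infty} E(\theta_n \mid s)\right\} = \int_0^a\int_0^b f(\xi, \eta)\,d\xi\,d\eta.$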

A very similar procedure will serve to evaluate the limit of $E(UV \mid s)/\Delta a\Delta b$.
Here, we replace the two freely variable points $(ij)$, $(kl)$ by two semi-fixed points,
one being restricted to the axis $O\xi$ and the other to the axis $O\eta$. That is,
instead of considering $E(U_nV_n \mid s)$ in equation (22) we consider

(37) £(*.!«)-rfU; t teLifi ± 

and it is cosy to sec that 

(38) lim !£((/. K.|,) -mi^)l£0(Aaf(ii). 

SO that the quantity (37) may be use,d in equations (13) and 120^ in i r 

r-Tifi,,'''‘''-I’) 

and therefore 

m M*. I.) - (..(„, +1) I (i r. g g I .)j 

Further, and in the same way, we may replace the sum in (40), namely 

s <*•'s,.I 4-s-.w») 

by the simpler sum 

S £. S IV “ (t-* +1) i: t xo, I ,) 

*“0 \ ;~0 / 


(42) 


= (e> + 1 S E E(xkoXoj\s). 

B foUorn thit we may repl.ee the limit of B(U.V. | .) a. erpreeeed in (22) by 
(«) lim (i, (., + 1 ) j. (., + 1,1 L ^ ^ 1 

1 . t -0 1—0 I’ 

and this is easily found to be equal to

(44) $\int_0^a\int_0^b f(\xi, \eta)\,d\xi\,d\eta,$

where $f(\xi, \eta)$ is defined by the formulae (31) and (32).

Hence, from (13), (36) and (44),

(45) $D(a, b \mid s) = 4\int_0^a\int_0^b f(\xi, \eta)\,d\xi\,d\eta.$




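3. The forms of the derivative. Since the function $f(\xi, \eta)$ has two different
forms (31) or (32) depending on the relationships between $a$, $b$, $\alpha$ and $\beta$, it will
be necessary to distinguish four different forms of the derivative (45), and of its
integral.

First, for values of $a$ and $b$ for which simultaneously

(46) $a < \alpha$ and $b < \beta,$

the integrand in (45) has the form (31) for the whole region of integration.
Hence the value of $D(a, b \mid s)$ in the region (46) is given by, say

(47) $D_1 = 4\int_0^a\int_0^b g^s(\alpha - \xi, \beta - \eta)\,d\xi\,d\eta = 4\int_{\alpha - a}^{\alpha}\int_{\beta - b}^{\beta} g^s(t, \tau)\,dt\,d\tau,$

where

(48) $g(t, \tau) = 1 - \frac{2\alpha\beta - t\tau}{R'}.$

Next, when $a \ge \alpha$ but $b < \beta$, the integrand in (45) has the form determined
by (31) only when

(49) $0 \le \xi < \alpha, \qquad 0 \le \eta \le b,$

whereas when

(50) $\alpha \le \xi \le a, \qquad 0 \le \eta \le b,$

the appropriate form is that determined by (32). Therefore here $D(a, b \mid s)$
has the form, say,

(51) $D_2 = 4b(a - \alpha)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4\int_0^{\alpha}\int_{\beta - b}^{\beta} g^s(t, \tau)\,dt\,d\tau.$

Similarly, for

(52) $a < \alpha$ but $b \ge \beta,$

$D(a, b \mid s)$ is given by, say,

(53) $D_3 = 4a(b - \beta)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4\int_{\alpha - a}^{\alpha}\int_0^{\beta} g^s(t, \tau)\,dt\,d\tau.$

Finally, in the region in which simultaneously

(54) $a \ge \alpha$ and $b \ge \beta,$

$D(a, b \mid s)$ has the form, say,

(55) $D_4 = 4(ab - \alpha\beta)\left(1 - \frac{2\alpha\beta}{R'}\right)^{s} + 4\int_0^{\alpha}\int_0^{\beta} g^s(t, \tau)\,dt\,d\tau.$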





4. The second moment of Y. We have now to determine $m(a, b \mid s)$ for all
non-negative values of $a$ and $b$, from the equation

(56) $\frac{\partial^2 m(a, b \mid s)}{\partial a\,\partial b} = D(a, b \mid s).$

The general solution of this equation is

(57) $m(a, b \mid s) = \int\int D(a, b \mid s)\,da\,db + A(a) + B(b),$

where $A(a)$ and $B(b)$ are each functions of one variable. These functions are
determined by the boundary conditions, namely

(68) m(a, 01 s) - m(0, b |.) - . o, 

da ob 

which are a consequence of the inequality $0 \le Y \le ab$. It is then easily found
that the only solution $m(a, b \mid s)$ satisfying (57) and (58) has the following four
different forms, depending on the values of $a$ and $b$.

If $a < \alpha$ and $b < \beta$, then

(59) $m(a, b \mid s) = \int_0^a\int_0^b D_1(x, y)\,dx\,dy = m_1(a, b \mid s)$ (say).

If $a \ge \alpha$ and $b < \beta$, then

(60) $m(a, b \mid s) = m_1(\alpha, b \mid s) + \int_{\alpha}^{a}\int_0^b D_2(x, y)\,dx\,dy = m_2(a, b \mid s)$ (say).

If $a < \alpha$ and $b \ge \beta$, then

(61) $m(a, b \mid s) = m_1(a, \beta \mid s) + \int_0^a\int_{\beta}^{b} D_3(x, y)\,dx\,dy = m_3(a, b \mid s)$ (say).

Finally, if $a \ge \alpha$ and $b \ge \beta$, then

(62) $m(a, b \mid s) = m_1(\alpha, \beta \mid s) + \int_{\alpha}^{a}\int_0^{\beta} D_2(x, y)\,dx\,dy + \int_0^{\alpha}\int_{\beta}^{b} D_3(x, y)\,dx\,dy$
$\qquad + \int_{\alpha}^{a}\int_{\beta}^{b} D_4(x, y)\,dx\,dy = m_4(a, b \mid s)$ (say).

The procedure used to evaluate the integrals (59) to (62) follows the same
general pattern, and we shall confine ourselves to outlining it in one case, say (59).
There

(63) $m_1(a, b \mid s) = \int_0^a\int_0^b D_1(x, y)\,dx\,dy$
$\qquad = 4\int_0^a\int_0^b dx\,dy\int_{\alpha - x}^{\alpha}\int_{\beta - y}^{\beta} g^s(t, \tau)\,dt\,d\tau$
$\qquad = 4\int_0^a dx\int_{\alpha - x}^{\alpha} dt\left\{\int_0^b dy\int_{\beta - y}^{\beta} g^s(t, \tau)\,d\tau\right\}.$




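Integrating the double integral in the braces by parts for $y$ we get, say,

(64) $I(t) = \int_0^b dy\int_{\beta - y}^{\beta} g^s(t, \tau)\,d\tau = \left[y\int_{\beta - y}^{\beta} g^s(t, \tau)\,d\tau\right]_0^b - \int_0^b y\,g^s(t, \beta - y)\,dy,$

whence, substituting $\beta - y = \tau$ in the last integral,

(65) $I(t) = b\int_{\beta - b}^{\beta} g^s(t, \tau)\,d\tau - \int_{\beta - b}^{\beta}(\beta - \tau)\,g^s(t, \tau)\,d\tau = \int_{\beta - b}^{\beta}(\tau + b - \beta)\,g^s(t, \tau)\,d\tau.$

Proceeding now in the same manner with the other double integration in (63),
we conclude that

(66) $m_1(a, b \mid s) = 4\int_0^a dx\int_{\alpha - x}^{\alpha} I(t)\,dt = 4\int_{\alpha - a}^{\alpha}(t + a - \alpha)\,I(t)\,dt$
$\qquad = 4\int_{\alpha - a}^{\alpha} dt\int_{\beta - b}^{\beta}(t + a - \alpha)(\tau + b - \beta)\,g^s(t, \tau)\,d\tau,$

where, throughout, $g(t, \tau)$ is defined by (48).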

Formulae for m_2(a, b | s), m_3(a, b | s) and m_4(a, b | s) are obtained by a similar
procedure. They may conveniently be summarized in the following single
expression. Define a symbol [x] for any real number x by the equations

$$[x] = x \ \text{ if } x \ge 0, \qquad [x] = 0 \ \text{ if } x \le 0. \tag{67}$$

With this notation, whatever be the relation between a, b, α and β, we have


m(a 


( 68 ) 


, l)|s)=4/' f (t + a - a)ir + b - P)<1 - 


2a)3 — ti\' 


didr 


+ {o^[6 - + b\a - a]^ - [a — ~ . 


We now allow s to take all values s = 0, 1, 2, ... with probabilities P_s given
by the generating function (1). Then it follows, from the form of (68), that

m(a, b) = 4 J^ (< + a - a)(T -f b - (3)4' - ^‘'^^ --'^didr 

+ la% - 4- bla - - [o - aflb - - f?)- 

On subtracting from this the square of the first moment of Y, which by (5)
and (8) is





we obtain the variance $\sigma_Y^2$ of Y. But the variance of Y is necessarily equal to the
variance $\sigma_X^2$ of X.


5. Particular cases. (i) = u^s. This is the case, considered originally,

in which the number s of centers of the rectangles ρ falling within R' is fixed.
The explicit evaluation of the variance $\sigma_X^2$ depends in this case on the evaluation
of the integral

L L «+“" +>'-«{(■- +1)' 

The evaluation is easy if one expands the binomial under the sign of the integral
and integrates term by term. Each such integral is a product of two simple
integrals.

(ii) S^afu) = Poisson Case. This is the case where the probabilities

P_s that there are exactly s centers of rectangles ρ within R' are given by the
Poisson Law, P, ~ Substituting the expression of the probability 

generating function into (69), we obtain for this case 

m(a, h) = [“ f (/ + a - a)(T + t - /3) £ didr 

(71) •'Ifl-M *-0 Si 

+ - «]“ ~ [a- a]^[6 - 0 \^\. 

On performing the integration term by term, and contracting the first term
of the resulting infinite series into the second line of equation (71), we readily
obtain the result


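$$m(a, b) = 4\alpha\beta\, e^{-2\lambda\alpha\beta} \sum_{s=1}^{\infty} \frac{(\lambda\alpha\beta)^s}{s!\,(s+1)^2 (s+2)^2} \left\{ (s+2)a - \alpha + \frac{[\alpha - a]^{s+2}}{\alpha^{s+1}} \right\} \left\{ (s+2)b - \beta + \frac{[\beta - b]^{s+2}}{\beta^{s+1}} \right\} + a^2 b^2 e^{-2\lambda\alpha\beta}, \tag{72}$$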


where [a] continues to have the meaning defined by (67). In virtue of equations 
(7) and (8), however, the last term of the expression (72) is precisely the square 
of the first moment of Y when s is Poisson distributed. Hence, for s Poisson 
distributed, we have the expression for the variance of Y and of X, 


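$$\sigma_Y^2 = \sigma_X^2 = 4\alpha\beta\, e^{-2\lambda\alpha\beta} \sum_{s=1}^{\infty} \frac{(\lambda\alpha\beta)^s}{s!\,(s+1)^2 (s+2)^2} \left\{ (s+2)a - \alpha + \frac{[\alpha - a]^{s+2}}{\alpha^{s+1}} \right\} \left\{ (s+2)b - \beta + \frac{[\beta - b]^{s+2}}{\beta^{s+1}} \right\}. \tag{73}$$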





(iii) 4 ' 3 (w.) = Contagious case. This is the case where the probabilities
P_s that there are exactly s centers of rectangles ρ within R' are given by
the contagious law of type A with two parameters*. The evaluation of the
second moment of Y is made easy by noticing that the probability generating
function appropriate to the contagious distribution may be expressed as
a series in terms of the probability generating function of the Poisson Law


(74) 


Mu) = E 

JkiO fc! 


_ -m Y' 

~ ^ A~I ITi ^ 
Aao ic' 


Thus the evaluation of the integral intervening in the formula for the second 
moment of Y is reduced in the present case to that of formula (71). 


6. Remarks on other cases. (i) It may be of interest, in amplification of
H. E. Robbins' results, to exhibit the analogues of formulas (68), (69) and (73)
in the one-dimensional case. For this case, then, if the interval a is embedded
in a larger interval a', we obtain by similar methods, beginning with the calculation of $dm/da$,

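$$m(a \mid s) = 2\int_{[\alpha - a]}^{\alpha} (t + a - \alpha) \left( 1 - \frac{2\alpha - t}{a'} \right)^{s} dt + [a - \alpha]^2 \left( 1 - \frac{2\alpha}{a'} \right)^{s}, \tag{75}$$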

whence 

(76) m(a) - 2 J ^ (i + a — a)^' dt [a — ^1 — ; 


in particular, if s is Poisson distributed, 


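$$\sigma_Y^2 = \sigma_X^2 = 2\alpha\, e^{-2\alpha\lambda} \sum_{s=1}^{\infty} \frac{(\lambda\alpha)^s}{s!\,(s+1)(s+2)} \left\{ (s+2)a - \alpha + \frac{[\alpha - a]^{s+2}}{\alpha^{s+1}} \right\}. \tag{77}$$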


The close parallel between these formulas and those for two dimensions makes it
natural to conjecture analogous formulas for n dimensions, but we have not
attempted to establish such formulas.

(ii) For the evaluation of the higher moments of Y it may be useful to notice 
that precisely the same method as that described above leads to the conclusion 
that the derivative of the n-th non central moment of Y is 


(78) 


m„(,a, b) 
dadb 


hm 


1 

AaAb 


\nEiX”-^W) + nin - l)E(X"-“ t/F)). 


* J. Neyman, "On a new class of contagious distributions", Annals of Math. Stat.,
Vol. 10 (1939), pp. 35-57.



ON THE MEASURE OF A RANDOM SET. II 


By H. E. Robbins

Postgraduate School, U. S. Naval Academy

1. Introduction. In a recent paper¹ the author derived general formulas for
the moments of the measure of any random set X, and applied the formulas to
find the mean and variance of a random sum of intervals on the line. In a
subsequent paper² J. Bronowski and J. Neyman, using other methods, found the
variance when X is a random sum of rectangles in the plane, and raised the
question of finding the variance when X is a random sum of n-dimensional
intervals in n-space. This will be done in the present paper, independently of
the work of Bronowski and Neyman, using the methods of (I). The corresponding
problem for circles in the plane will also be solved.

2. n-dimensional intervals, N fixed. Let the random set X be defined as
follows. Let $A_i$, $a_i$ (the range of the subscript i throughout this paper will be
from 1 to n) and δ be fixed positive numbers such that $a_i < 2\delta$. Let R denote the
n-dimensional interval consisting of all points $(x_1, \ldots, x_n)$ such that $0 < x_i < A_i$,
and let R' denote the larger interval for which $-\delta < x_i < A_i + \delta$ (and also
its measure $\prod (A_i + 2\delta)$). Let a fixed number N of intervals with sides $a_i$
parallel to the axes be chosen independently, with the probability density function
for the center of each interval constant and equal to 1/R' in R'. The set X
is the intersection of the set-theoretical sum of the N intervals with R. The set
Y consists of those points of R that do not belong to X. We have identically

$$X + Y = R, \tag{1}$$

where capital letters denote either sets or their measures.

From (I), equation (16), we have

$$E(Y) = \int_0^{A_1} \cdots \int_0^{A_n} p(x_1, \ldots, x_n)\, dx_1 \cdots dx_n, \tag{2}$$

where, setting $r = \prod a_i$, we have

$$p(x_1, \ldots, x_n) = \Pr\{(x_1, \ldots, x_n) \in Y\} = \left( 1 - \frac{r}{R'} \right)^{N}. \tag{3}$$

Hence

$$E(Y) = R \left( 1 - \frac{r}{R'} \right)^{N}. \tag{4}$$

¹ H. E. Robbins, "On the measure of a random set," Annals of Math. Stat., Vol. 15
(1944), pp. 70-74. We shall refer to this paper as (I).

² J. Bronowski and J. Neyman, "On the variance of a random set," Annals of
Math. Stat., Vol. 16 (1945), pp. 330-341. We shall refer to this paper as (BN).



From (1) it follows that

$$E(X) = R \left\{ 1 - \left( 1 - \frac{r}{R'} \right)^{N} \right\}. \tag{5}$$
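As a rough plausibility check of (5), one can simulate the one-dimensional case (n = 1). The sketch below is illustrative only and not part of the paper: the helper names, the parameter values and the use of a seeded pseudo-random generator are all assumptions made for the example.

```python
import random


def covered_length(centers, side, A):
    """Length of the part of [0, A] covered by intervals of length `side`."""
    pieces = sorted((max(c - side / 2, 0.0), min(c + side / 2, A)) for c in centers)
    total, reach = 0.0, 0.0
    for lo, hi in pieces:
        if hi <= lo:          # this interval misses [0, A] entirely
            continue
        lo = max(lo, reach)   # do not count an overlap twice
        if hi > lo:
            total += hi - lo
            reach = hi
    return total


def simulate_mean_X(A, side, delta, N, trials, seed=0):
    """Monte Carlo estimate of E(X): centers uniform on R' = (-delta, A + delta)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        centers = [rng.uniform(-delta, A + delta) for _ in range(N)]
        acc += covered_length(centers, side, A)
    return acc / trials


def formula_mean_X(A, side, delta, N):
    """Right-hand side of (5) for n = 1: R = A, R' = A + 2*delta, r = side."""
    return A * (1 - (1 - side / (A + 2 * delta)) ** N)


if __name__ == "__main__":
    print(simulate_mean_X(1.0, 0.3, 0.2, 5, 20000), formula_mean_X(1.0, 0.3, 0.2, 5))
```

The example respects the hypothesis a < 2δ (here 0.3 < 0.4); the two printed numbers should agree only up to sampling error.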




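From (I), equation (21), we have

$$E(Y^2) = \int_0^{A_1} \cdots \int_0^{A_n} \int_0^{A_1} \cdots \int_0^{A_n} p(x_1, \ldots, x_n, y_1, \ldots, y_n)\, dx_1 \cdots dx_n\, dy_1 \cdots dy_n, \tag{6}$$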

where 

$$p(x_1, \ldots, x_n, y_1, \ldots, y_n) = \Pr\{(x_1, \ldots, x_n) \in Y \text{ and } (y_1, \ldots, y_n) \in Y\}. \tag{7}$$

It is clear from the symmetry of the problem that the distribution of Y will be
unchanged if we assume that for all i, $x_i < y_i$. Hence, since there are $2^n$ possible
sets of n inequalities each, we can write

$$E(Y^2) = 2^n \int_0^{A_1} \cdots \int_0^{A_n} \int_0^{y_1} \cdots \int_0^{y_n} p\; dx_1 \cdots dx_n\, dy_1 \cdots dy_n. \tag{8}$$


We now introduce the new variables of integration

$$u_i = x_i, \qquad v_i = y_i - x_i, \tag{9}$$

for which

$$\frac{\partial(u_1, \ldots, u_n, v_1, \ldots, v_n)}{\partial(x_1, \ldots, x_n, y_1, \ldots, y_n)} = 1. \tag{10}$$

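In terms of the new variables we have

$$p = f(v_1, \ldots, v_n) = \begin{cases} \left( 1 - \dfrac{2r}{R'} \right)^{N} & \text{if } v_i \ge a_i \text{ for some } i, \\[2ex] \left( 1 - \dfrac{2r - \prod (a_i - v_i)}{R'} \right)^{N} & \text{if } v_i < a_i \text{ for all } i. \end{cases} \tag{11}$$

Equation (8) now becomes

$$E(Y^2) = 2^n \int_0^{A_1} \cdots \int_0^{A_n} \int_0^{A_1 - v_1} \cdots \int_0^{A_n - v_n} f\; du_1 \cdots du_n\, dv_1 \cdots dv_n = 2^n \int_0^{A_1} \cdots \int_0^{A_n} f \prod (A_i - v_i)\, dv_1 \cdots dv_n. \tag{12}$$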

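Let $z_i = \min(a_i, A_i)$. Then from (11) and (12) we obtain

$$E(Y^2) = 2^n \int_0^{z_1} \cdots \int_0^{z_n} \left( 1 - \frac{2r - \prod (a_i - v_i)}{R'} \right)^{N} \prod (A_i - v_i)\, dv_1 \cdots dv_n + 2^n \left( 1 - \frac{2r}{R'} \right)^{N} \left\{ \int_0^{A_1} \cdots \int_0^{A_n} \prod (A_i - v_i)\, dv_1 \cdots dv_n - \int_0^{z_1} \cdots \int_0^{z_n} \prod (A_i - v_i)\, dv_1 \cdots dv_n \right\}. \tag{13}$$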





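Let the symbol [x], as in (BN), be defined by

$$[x] = \begin{cases} x & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases} \tag{14}$$

In the integral in the first line of (13) we introduce the new variables of integration
$w_i = a_i - v_i$, while in the two integrals in the second line we introduce
the variables $s_i = A_i - v_i$. The result is

$$E(Y^2) = 2^n \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \left( 1 - \frac{2r - \prod w_i}{R'} \right)^{N} \prod (w_i + A_i - a_i)\, dw_1 \cdots dw_n + \left( 1 - \frac{2r}{R'} \right)^{N} \left\{ \prod A_i^2 - \prod \left( A_i^2 - [A_i - a_i]^2 \right) \right\}. \tag{15}$$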


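From (1) we see that $\sigma_X^2 = E(X^2) - E^2(X) = E(Y^2) - E^2(Y)$. Thus from (4)
and (15) we have

$$\sigma_X^2 = 2^n \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \left( 1 - \frac{2r - \prod w_i}{R'} \right)^{N} \prod (w_i + A_i - a_i)\, dw_1 \cdots dw_n + \left( 1 - \frac{2r}{R'} \right)^{N} \left\{ \prod A_i^2 - \prod \left( A_i^2 - [A_i - a_i]^2 \right) \right\} - R^2 \left( 1 - \frac{r}{R'} \right)^{2N}. \tag{16}$$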

3. n-dimensional intervals, N variable. Now let X and Y be defined as before
except that the number N is taken as a random variable, capable of assuming the
values 0, 1, ... with respective probabilities $p_0, p_1, \ldots$, and with generating
function

$$\varphi(t) = \sum_{N=0}^{\infty} p_N t^N. \tag{17}$$





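Then from (5) we have

$$E(X) = \sum_{N=0}^{\infty} p_N R \left\{ 1 - \left( 1 - \frac{r}{R'} \right)^{N} \right\} = R \left\{ 1 - \varphi\!\left( 1 - \frac{r}{R'} \right) \right\}, \tag{18}$$

while from (15) we have

$$\sigma_X^2 = E(Y^2) - E^2(Y) = 2^n \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} \varphi\!\left( 1 - \frac{2r - \prod w_i}{R'} \right) \prod (w_i + A_i - a_i)\, dw_1 \cdots dw_n + \varphi\!\left( 1 - \frac{2r}{R'} \right) \left\{ \prod A_i^2 - \prod \left( A_i^2 - [A_i - a_i]^2 \right) \right\} - R^2 \varphi^2\!\left( 1 - \frac{r}{R'} \right). \tag{19}$$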

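In particular, suppose that, as in (BN), N has a Poisson distribution with a
parameter λ,

$$p_N = e^{-\lambda R'} \frac{(\lambda R')^N}{N!}, \tag{20}$$

so that

$$\varphi(t) = e^{\lambda R'(t - 1)}. \tag{21}$$

Then (18) becomes

$$E(X) = R \left( 1 - e^{-\lambda r} \right), \tag{22}$$

while (19) becomes

$$\sigma_X^2 = e^{-2\lambda r} \left[ 2^n \int_{[a_1 - A_1]}^{a_1} \cdots \int_{[a_n - A_n]}^{a_n} e^{\lambda \prod w_i} \left\{ \prod (w_i + A_i - a_i) \right\} dw_1 \cdots dw_n + \left\{ \prod A_i^2 - \prod \left( A_i^2 - [A_i - a_i]^2 \right) \right\} - R^2 \right]. \tag{23}$$

Integrating term by term and simplifying the resulting expression, we obtain
finally

$$\sigma_X^2 = 2^n r\, e^{-2\lambda r} \sum_{N=1}^{\infty} \frac{(\lambda r)^N}{N!\,(N+1)^n (N+2)^n} \prod \left\{ (N+2) A_i - a_i + \frac{[a_i - A_i]^{N+2}}{a_i^{N+1}} \right\}. \tag{24}$$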
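The series (24) is easy to evaluate numerically. The following sketch is not part of the paper: the function names, the truncation at 60 terms and the sample parameters are assumptions chosen for illustration. For n = 1 the result can be compared with the integral form (23), which has a closed form in that case.

```python
import math


def bracket(x):
    """The symbol [x] of (14): x if x >= 0, else 0."""
    return x if x >= 0 else 0.0


def variance_poisson(A, a, lam, terms=60):
    """Truncated series (24); A and a are sequences of the sides A_i and a_i."""
    n = len(A)
    r = math.prod(a)
    total = 0.0
    for N in range(1, terms + 1):
        coeff = (lam * r) ** N / (math.factorial(N) * ((N + 1) * (N + 2)) ** n)
        product = 1.0
        for A_i, a_i in zip(A, a):
            product *= (N + 2) * A_i - a_i + bracket(a_i - A_i) ** (N + 2) / a_i ** (N + 1)
        total += coeff * product
    return 2 ** n * r * math.exp(-2 * lam * r) * total


if __name__ == "__main__":
    # One-dimensional example with A = 1, a = 0.5, lambda = 1.
    print(variance_poisson([1.0], [0.5], 1.0))
```

Because the terms decay factorially, a few dozen terms are ample for moderate values of λr; for large λr the truncation point would need to grow.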

4. Circles in the plane. Let the random set X be defined as follows. Let
$A_1$, $A_2$, a, and δ be fixed positive numbers such that $2a < \min(A_1, A_2, 2\delta)$.
Let R denote the rectangle consisting of all points $(x_1, x_2)$ such that $0 < x_1 < A_1$,
$0 < x_2 < A_2$, and let R' denote the larger rectangle for which $-\delta < x_1 < A_1 + \delta$,
$-\delta < x_2 < A_2 + \delta$. Let a fixed number N of circles with radii a and areas
$b = \pi a^2$ be chosen independently, with the probability density function for
the center of each circle constant and equal to 1/R' in R'. The set X is the
intersection of the set-theoretical sum of the N circles with R. The set Y consists
of those points of R that do not belong to X. Equation (1) holds as before.
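The analogue of (4) is

$$E(Y) = \int_0^{A_2} \int_0^{A_1} p(x_1, x_2)\, dx_1\, dx_2 = R \left( 1 - \frac{b}{R'} \right)^{N}, \tag{25}$$

while (8) becomes

$$E(Y^2) = 4 \int_0^{A_2} \int_0^{A_1} \int_0^{y_2} \int_0^{y_1} p(x_1, x_2, y_1, y_2)\, dx_1\, dx_2\, dy_1\, dy_2, \tag{26}$$

where

$$p(x_1, x_2, y_1, y_2) = \Pr\{(x_1, x_2) \in Y \text{ and } (y_1, y_2) \in Y\}. \tag{27}$$

Introducing the new variables (9) we obtain the analogue of (12),

$$E(Y^2) = 4 \int_0^{A_2} \int_0^{A_1} (A_2 - v_2)(A_1 - v_1)\, f\; dv_1\, dv_2, \tag{28}$$

where, setting $r = (v_1^2 + v_2^2)^{1/2}$,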



Introducing polar coordinates r, θ in the $v_1, v_2$-plane and carrying out the obvious
integrations, we obtain


P(F') = (l - jp* -h I a’ (Ai + Az) - Sa* - 45h:| 

(30) + 8o* f’ (rRt -f 4a* t* - 4a(Ai -f Az)r) 

Jo 

2b — 2a arceos i -f 2a* t\/1 — 

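If now N is a random variable with generating function (17), then (25) becomes

$$E(Y) = R\, \varphi\!\left( 1 - \frac{b}{R'} \right), \tag{31}$$

and hence

$$E(X) = R \left\{ 1 - \varphi\!\left( 1 - \frac{b}{R'} \right) \right\}, \tag{32}$$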





E(X') - E\X) = E{Y^) - B\Y) 

v(l- 4- y a\Ax + Aa) - 8a‘ - 4^ 

— 22^ "H 8o* (ir72t + 4o* t* — A.(iiA\ 4- 

/. 2b — 2a arccos t 4" 2a* t ^/1 — 2® 

• ''(,1 - w -, 



SAMPLING FROM A CHANGING POPULATION¹ ²


By Reinhold Baer
University of Illinois


1. Introduction. If, in sampling a certain population, it is impossible to take
more than one sample at any given time, and if the population changes between
any two samples, then we are confronted with the following mathematical situation.
For every t, 0 ≤ t ≤ 1,³ there is given a distribution⁴ (= population)
D(t). Let furthermore $t_j$ be, for 0 < j ≤ n, a number between (j - 1)/n and
j/n; and assume that $x_j$ is a sample taken from the population $D(t_j)$. We denote
by $T_n$ the set of the numbers $t_1, \ldots, t_n$ and by $O(T_n)$ the sample consisting of
the $x_j$; and we assume that $O(T_n)$ is a random sample, i.e. that $x_1, \ldots, x_n$ are
independent variables. The question arises to get information concerning the
family D(t) from the sample $O(T_n)$. It is clearly hopeless to try for information
concerning an individual D(t) or even some $D(t_j)$ or the statistics that may be
derived from them. But we may hope for information in the mean, if we assume
that the family D(t) is in some sense continuous in t. To make this statement
more precise we denote by a(t) the average and by $M_i(t)$ the i-th moment of
D(t) around its average. We assume then that a(t) and $M_i(t)$, for i ≤ 8, exist
and are continuous functions of t, and in section 7 we shall have to assume
furthermore that a(t) and $M_2(t)$ are functions of bounded variation. These
hypotheses assure the existence of

$$\text{the mean average } a = \int_0^1 a(t)\, dt$$

$$\text{and the mean } i\text{-th moment } M_i = \int_0^1 M_i(t)\, dt$$

for i ≤ 8. Clearly we may hope for information concerning a and $M_i$ from the
random sample $O(T_n)$. It is our object to discuss certain more or less well
known statistics of the sample $O(T_n)$, and to determine their stochastic limits⁵.


¹ Presented to the American Mathematical Society, September 16, 1945.

² The author is indebted to Dr. E. L. Welker for checking the results, in particular those
rather obnoxious computations needed in sections 6 and 7 which the author did not
incorporate into this paper.

³ It constitutes a restriction of generality that we consider finite closed intervals only.
But it is no further loss in generality to use the interval from 0 to 1, and this choice certainly
simplifies notations.

⁴ Comparatively little will be assumed of these distributions. These properties will
be enumerated in Section 2.

⁵ See [2] p. 81 and the criterion 2.d. of section 2.



As an illustration we mention the following results which will be obtained in the
course of this investigation (among others):⁶

$\bar{x} = n^{-1} \sum_{j=1}^{n} x_j$ converges stochastically to the mean average a;

$s^2 = n^{-1} \sum_{j=1}^{n} (x_j - \bar{x})^2$ converges stochastically to $M_2 + \int_0^1 (a(t) - a)^2\, dt$;

$d^2 = (2n)^{-1} \sum_{j=1}^{n-1} (x_j - x_{j+1})^2$ converges stochastically to the mean variance $M_2$.

It is clear that $M_2$ is the stochastic limit of $s^2$ if, and only if, a(t) is constant.
If a(t) is not constant, then $s^2$ is not a consistent estimate⁷ of $M_2$, and will have
to be rejected, at least for large n, in favor of $d^2$ which is always a consistent
estimate of $M_2$.
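The contrast between $s^2$ and $d^2$ can be seen in a small simulation. The sketch below is only an illustration and not part of the paper: it assumes normal distributions D(t) with mean a(t) = t and unit variance, takes $t_j$ as the midpoint of its interval, and uses a seeded pseudo-random generator; all names are hypothetical.

```python
import random


def simulate_statistics(n, mean_fn, sd_fn, seed=0):
    """Draw x_j from a normal D(t_j) and return (x_bar, s^2, d^2)."""
    rng = random.Random(seed)
    # t_j is taken as the midpoint of ((j-1)/n, j/n); any point of that interval is allowed.
    xs = [rng.gauss(mean_fn((j - 0.5) / n), sd_fn((j - 0.5) / n)) for j in range(1, n + 1)]
    x_bar = sum(xs) / n
    s2 = sum((x - x_bar) ** 2 for x in xs) / n
    d2 = sum((xs[j] - xs[j + 1]) ** 2 for j in range(n - 1)) / (2 * n)
    return x_bar, s2, d2


if __name__ == "__main__":
    # a(t) = t gives mean average 1/2 and variance of a(t) equal to 1/12; M_2 = 1.
    print(simulate_statistics(100000, lambda t: t, lambda t: 1.0))
```

Under these assumptions one expects $\bar{x}$ near 1/2, $s^2$ near 1 + 1/12 and $d^2$ near 1, in line with the three limits stated above.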

It was this last point that led us into this investigation. Recently the statistic
$d^2$ has found much attention, and the question arose as to why the statistic
$s^2$ should be rejected in favor of $d^2$. Reading the illuminating introduction of the
fundamental paper [1], one sees that just such a situation as we have attempted
to describe here in somewhat abstract terms has necessitated the use of $d^2$.
Consequently our result may be considered a theoretical justification for this
procedure.

Our other results will be discussed in their interrelation as they are obtained.
It should be noted that all our results concern themselves with stochastic convergence,
and thus they justify the use of a sample function as an estimate of
some statistical number only for sufficiently large size n of the sample. Thus
it is quite possible that for small n other functions provide better estimates.
The practical applicability of our results depends, therefore, on a criterion for n
to be sufficiently large, and unfortunately such a criterion is not yet available.

2. Notations and fundamental properties. We have not stated in the Intro¬ 
duction the hypotheses to which we subject the distributions under considera¬ 
tion. For our investigation we shall need only very few properties of distribu¬ 
tions. Thus we are going to enumerate now some properties of distributions 
which we are going to use, and we shall assume throughout that these properties 
are satisfied. As will be seen these hypotheses are rather weak and are satisfied 
by a large class of distributions. 

If x is any stochastic variable, then we denote by E(x) its mathematical expectation,
and the only properties of stochastic variables that concern us are
properties of their expectations. E(x) is a linear operation satisfying E(1) = 1.

⁶ It should be noted that the stochastic limit of the following statistics would not be
changed, if we substituted for the denominator n of $s^2$ the denominator n - 1 which is often
used, and if we allowed the summation in the expression for $d^2$ to range from 1 to n, defining
$x_{n+1} = x_1$.

⁷ Wilks [2], p. 133.





If furthermore $x_1, \ldots, x_n$ are independent variables, and if the function f
depends on some of these variables whereas g depends only on the others, then
E(fg) = E(f)E(g), and this property may serve as a definition of independence.

As stated in the Introduction we are going to study a family D(t) of distributions,
for 0 ≤ t ≤ 1. If x is the stochastic variable of the distribution D(t)
for some fixed t, then we let

$$a(t) = E(x) \quad \text{and} \quad M_i(t) = E\left( (x - a(t))^i \right).$$

We shall assume throughout that the average a(t) and the variance $M_2(t)$ exist
for every t, and that a(t) and $M_2(t)$ are continuous functions of t. Moreover,
when discussing $M_i$ for 1 < i ≤ 4, we shall assume that every $M_j(t)$ with j ≤ 2i
is a continuous function of t. Thus we are sure that the mean average a
and the mean variance $M_2$, as defined in the Introduction, always exist, and
the mean i-th moment $M_i$ exists, whenever $M_i(t)$ is a continuous function of t.

Remark: If the mean i-th moment $M_i$ exists for every i, then one may be
tempted to consider as the mean of the family D(t) a distribution D with average
a and i-th moment $M_i$, provided such a distribution exists. But this has to be
done with some caution. For suppose that every D(t) is normal. Then $M_i(t) = 0$
for every odd i, implying $M_i = 0$ for odd i so that D would be symmetric.
But $M_{2i}(t) = 1 \cdot 3 \cdots (2i - 1)\, M_2(t)^i$ and hence

$$M_{2i} = 1 \cdot 3 \cdots (2i - 1) \int_0^1 M_2(t)^i\, dt,$$

and the integral will be the i-th power of $M_2$ only if $M_2(t)$ is constant.
Thus the mean distribution D of a continuous family of normal distributions
need not be normal.
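A concrete instance of this remark, offered only as an illustration: assume a family of normal distributions whose variance is the polynomial $M_2(t) = 1 + t$ (a hypothetical choice). The sketch below computes $M_2$ and $M_4$ exactly and compares $M_4$ with the value $3 M_2^2$ that a normal distribution with variance $M_2$ would have.

```python
from fractions import Fraction


def integrate_polynomial(coeffs):
    """Exact integral over [0, 1] of sum(coeffs[k] * t**k)."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(coeffs))


def poly_mul(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def mean_moments_of_normal_family(variance_poly):
    """Return (M_2, M_4, 3 * M_2**2) for normal D(t) with variance polynomial M_2(t)."""
    M2 = integrate_polynomial(variance_poly)
    M4 = 3 * integrate_polynomial(poly_mul(variance_poly, variance_poly))
    return M2, M4, 3 * M2 ** 2


if __name__ == "__main__":
    print(mean_moments_of_normal_family([1, 1]))  # M_2(t) = 1 + t
```

For this choice the mean fourth moment exceeds $3 M_2^2$, so the mean distribution cannot be normal.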

As in the Introduction we now let $t_j$ be some number between (j - 1)/n and
j/n, and denote by $x_j$ a sample taken from the distribution $D(t_j)$. We denote
by $T_n$ the set of the n numbers $t_j$ and by $O(T_n)$ the sample consisting of the $x_j$.
It will be assumed throughout that $O(T_n)$ is a random sample, i.e. we shall
assume that $x_1, \ldots, x_n$ are independent variables.

We are not going to make any use of the customary definition of stochastic
convergence⁸ (and we shall therefore not restate it). Instead we are going to
apply throughout the following criterion⁹ ¹⁰:

2.d. The function $f(O(T_n))$ of the sample $O(T_n)$ converges stochastically to the
number r, if

$$\lim_{n \to \infty} E\left( f(O(T_n)) \right) = r \quad \text{and} \quad \lim_{n \to \infty} E\left( \left[ f(O(T_n)) - E(f(O(T_n))) \right]^2 \right) = 0.$$

All the sample functions considered will be polynomials of the variables 

⁸ Wilks [2], p. 81.

⁹ Wilks [2], Theorem (A), p. 134.

¹⁰ The validity of criterion 2.d. implies stochastic convergence in the customary sense.
Thus, all results obtained in the present paper remain valid also when the customary
definition of stochastic convergence is adopted.





3. The mean average. Though the discussion of this section is rather obvious,
we give the details, since they may serve as a convenient introduction to the type
of argument we have to use throughout.

Theorem. $\bar{x}$ converges stochastically to a.

Proof: We note first that $E(\bar{x}) = n^{-1} \sum_{j=1}^{n} E(x_j) = n^{-1} \sum_{j=1}^{n} a(t_j)$. Since $t_j$ is
between (j - 1)/n and j/n, and since $n^{-1}$ is the length of this interval, it follows
from the continuity of a(t) that

$$\int_0^1 a(t)\, dt = \lim_{n \to \infty} n^{-1} \sum_{j=1}^{n} a(t_j);$$

and thus we have shown that $E(\bar{x})$ tends to a as n tends to infinity.

Next we find that

$$E\left( (\bar{x} - E(\bar{x}))^2 \right) = n^{-2} E\left( \left[ \sum_{j=1}^{n} (x_j - a(t_j)) \right]^2 \right) = n^{-2} \sum_{j=1}^{n} E\left( (x_j - a(t_j))^2 \right) = n^{-2} \sum_{j=1}^{n} M_2(t_j),$$

since $E((x_j - a(t_j))(x_h - a(t_h))) = E(x_j - a(t_j))\, E(x_h - a(t_h)) = 0$ for j ≠ h.
But $M_2(t)$ is, for 0 ≤ t ≤ 1, a bounded non-negative function, showing that
$E((\bar{x} - E(\bar{x}))^2)$ tends to 0 as n tends to infinity. Applying 2.d. we find that
$\bar{x}$ converges stochastically to a, as we intended to show.

Remark: It is clear that the speed of the stochastic convergence of $\bar{x}$ to a depends
on two factors:

(i) the goodness of $\bar{x}$ as an estimate of $E(\bar{x})$;

(ii) the speed of convergence of the sums $n^{-1} \sum_{j=1}^{n} a(t_j)$ to the integral $a = \int_0^1 a(t)\, dt$.

It is this difficulty which expresses itself in (ii) and which makes the present
type of statistical estimation less effective than the one concerned with sampling
from one distribution only. As to (i), it is again, as may be seen from the proof,
of the order of magnitude $(M_2/n)^{1/2}$ (see Theorem 1, section 7).

It is probable that $\bar{x}$ is a better estimate of $E(\bar{x})$ than of a. But this does not
help, since the former depends on the particular choice of $T_n$.

4. The variance. Theorem 1. $d^2$ converges stochastically to $M_2$.

Proof: We note first that

$$E\left( (x_j - x_{j+1})^2 \right) = E\left( \left[ (x_j - a(t_j)) + (a(t_j) - a(t_{j+1})) + (a(t_{j+1}) - x_{j+1}) \right]^2 \right) = M_2(t_j) + (a(t_j) - a(t_{j+1}))^2 + M_2(t_{j+1}),$$

since $E((x_j - a(t_j))(x_{j+1} - a(t_{j+1}))) = E(x_j - a(t_j))\, E(x_{j+1} - a(t_{j+1})) = 0$,
E(const) = const and $E((x_j - a(t_j))^2) = M_2(t_j)$. Hence

$$E(d^2) = (2n)^{-1}(A + B - C),$$





where $A = 2 \sum_{j=1}^{n} M_2(t_j)$, $B = \sum_{j=1}^{n-1} (a(t_j) - a(t_{j+1}))^2$, $C = M_2(t_1) + M_2(t_n)$. Since
$t_j$ is a value between (j - 1)/n and j/n, and since $n^{-1}$ is the length of this interval,
it follows from the continuity of the function $M_2(t)$ that

$$M_2 = \int_0^1 M_2(t)\, dt = \lim_{n \to \infty} (2n)^{-1} A.$$

Since $M_2(t)$ is bounded as a continuous function, it follows that
$(2n)^{-1} C$ tends to 0 as n tends to infinity. Finally we infer from the continuity
of a(t), which is used here for the first time to its full extent, that there exists
to every given positive ε an integer N = N(ε) such that $(a(t') - a(t''))^2 < \varepsilon$
for $|t' - t''| < 2N^{-1}$. Thus for N(ε) < n we have $(a(t_j) - a(t_{j+1}))^2 < \varepsilon$ and
$(2n)^{-1} B < \varepsilon$. Hence $(2n)^{-1} B$ tends to 0 as n tends to infinity, and we have
shown that $E(d^2)$ tends to $M_2$ as n tends to infinity.

Next we note that

$$E\left( (d^2 - E(d^2))^2 \right) = E(d^4) - E(d^2)^2 = (2n)^{-2} \sum_{i,j} \left[ E\left( (x_i - x_{i+1})^2 (x_j - x_{j+1})^2 \right) - E\left( (x_i - x_{i+1})^2 \right) E\left( (x_j - x_{j+1})^2 \right) \right].$$

But if both i and i + 1 are different from j and j + 1, then
$E((x_i - x_{i+1})^2 (x_j - x_{j+1})^2) = E((x_i - x_{i+1})^2)\, E((x_j - x_{j+1})^2)$, and thus there are not more than 3n
summands in the above summation that are not identically 0. These summands,
however, depend only on $M_2(t_h)$, $M_3(t_h)$ and $M_4(t_h)$, and they are
therefore bounded. Thus $E((d^2 - E(d^2))^2)$ is equal to $(2n)^{-2}$ times a sum of
not more than 3n summands which are bounded. Hence $E((d^2 - E(d^2))^2)$ tends
to 0, as n tends to infinity. Now our theorem is an immediate consequence of
the criterion 2.d.

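Theorem 2. $s^2$ converges stochastically to $M_2 + \int_0^1 (a(t) - a)^2\, dt$.

Proof: We note first that $n(x_j - \bar{x}) = \sum_{h=1}^{n} (x_j - x_h)$ and that therefore
$s^2 = n^{-3} \sum_{j=1}^{n} \sum_{h,k} (x_j - x_h)(x_j - x_k)$. Since
$x_j - x_h = x_j - a(t_j) + a(t_j) - a(t_h) - (x_h - a(t_h))$, we find as usual that, for h ≠ j,

$$E\left( (x_j - x_h)^2 \right) = M_2(t_j) + (a(t_j) - a(t_h))^2 + M_2(t_h),$$

and if h ≠ k (both different from j) we find that

$$E\left( (x_j - x_h)(x_j - x_k) \right) = M_2(t_j) + (a(t_j) - a(t_h))(a(t_j) - a(t_k)).$$

Consequently

$$\sum_{h,k} E\left( (x_j - x_h)(x_j - x_k) \right) = (n-1)^2 M_2(t_j) + \sum_{h \ne j} M_2(t_h) + \sum_{h,k} (a(t_j) - a(t_h))(a(t_j) - a(t_k)) = (n^2 - 2n)\, M_2(t_j) + \sum_{h=1}^{n} M_2(t_h) + \left[ \sum_{h=1}^{n} (a(t_j) - a(t_h)) \right]^2.$$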





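Consequently

$$E(s^2) = (n-2)\, n^{-2} \sum_{j=1}^{n} M_2(t_j) + n^{-2} \sum_{k=1}^{n} M_2(t_k) + n^{-3} \sum_{j=1}^{n} \left[ \sum_{h=1}^{n} (a(t_j) - a(t_h)) \right]^2.$$

As in the proof of Theorem 1 we see that the first of these sums tends to $M_2$ as
n tends to infinity, and the second of these sums therefore tends to 0 as n tends
to infinity. The last sum equals

$$n^{-3} \sum_{j} \sum_{h,h'} \left[ a(t_j)^2 - a(t_j)(a(t_h) + a(t_{h'})) + a(t_h) a(t_{h'}) \right] = n^{-1} \sum_{j=1}^{n} a(t_j)^2 - 2n^{-2} \sum_{j,h} a(t_j) a(t_h) + n^{-2} \sum_{h,h'} a(t_h) a(t_{h'}) = n^{-1} \sum_{j=1}^{n} a(t_j)^2 - \left[ n^{-1} \sum_{j=1}^{n} a(t_j) \right]^2,$$

and this expression tends to $\int_0^1 a(t)^2\, dt - \left[ \int_0^1 a(t)\, dt \right]^2$ as n tends to infinity.
But

$$\int_0^1 a(t)^2\, dt - \left[ \int_0^1 a(t)\, dt \right]^2 = \int_0^1 (a(t) - a)^2\, dt,$$

since $a = \int_0^1 a(t)\, dt$, and thus we have shown that $E(s^2)$ tends to
$M_2 + \int_0^1 (a(t) - a)^2\, dt$ as n tends to infinity.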

If j, h, k, p, q, r are integers between 1 and n, we put

$$(j, h, k; p, q, r) = E\left( (x_j - x_h)(x_j - x_k)(x_p - x_q)(x_p - x_r) \right) - E\left( (x_j - x_h)(x_j - x_k) \right) E\left( (x_p - x_q)(x_p - x_r) \right).$$

If neither j, h nor k is equal to any of the three integers p, q, r, it follows from the
independence of the variables $x_i$ that (j, h, k; p, q, r) = 0. Thus

$$E\left( (s^2 - E(s^2))^2 \right) = E(s^4) - E(s^2)^2 = n^{-6} \sum{}' (j, h, k; p, q, r),$$

where the summation is taken over all the values of j, h, k, p, q, r between 1 and
n with the restriction that at least one of the three numbers j, h, k is equal to at
least one of the three numbers p, q, r. This sum contains therefore not more than
$3^2 n^5$ summands, and each of the summands is bounded, since they depend only on
$a(t_i)$, $M_2(t_i)$, $M_3(t_i)$ and $M_4(t_i)$. Thus $E((s^2 - E(s^2))^2)$ is equal to $n^{-6}$ times a
sum of not more than $3^2 n^5$ summands which are bounded. Hence
$E((s^2 - E(s^2))^2)$ tends to 0 as n tends to infinity. Now our theorem is an immediate
consequence of the criterion 2.d.

Noting that $\int_0^1 (a(t) - a)^2\, dt$ is nothing but the variance of the function
a(t) (around its mean a), we obtain the following obvious consequence of Theorems
1 and 2.





Corollary: $s^2 - d^2$ converges stochastically to the variance of a(t).

Remarks similar to those made in connection with the proof of the theorem of
section 3 may be made now in regard to the theorems of this section.

By similar arguments it is possible to prove that the statistic $n^{-1} \sum_{j=1}^{n-1} x_j x_{j+1}$
converges stochastically to $\int_0^1 a(t)^2\, dt$.

n—I 

5. The third moment. Put $d(3) = n^{-1} \sum_{j=1}^{n-2} (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})$. Then
d(3) is a function of the random sample $O(T_n)$.

Theorem 1: d(3) converges stochastically to $M_3$.

Proof: It is readily seen that

JS((xy — xy+i)’(a:y.n — Xy+i)) = "b (“(^y+i) ■“ 

+ (o(<j) ~ (o(ij) ~ a(^/+i))* + Mj(/y+i)), 

and in practically the same fashion as in the proof of Theorem 1 of section 4 one
shows now that E(d(3)) tends to $M_3$ as n tends to infinity.

Furthermore we have

$$E\left( (d(3) - E(d(3)))^2 \right) = E(d(3)^2) - E(d(3))^2 = n^{-2} \sum_{j,h} (j, h),$$

where

$$(j, h) = E\left( (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2}) (x_h - x_{h+1})^2 (x_{h+1} - x_{h+2}) \right) - E\left( (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2}) \right) E\left( (x_h - x_{h+1})^2 (x_{h+1} - x_{h+2}) \right).$$

Clearly (j, h) = 0 whenever j + 2 < h or h + 2 < j. Consequently there
appear actually in the sum of all the (j, h) not more than 5n terms each of which
is bounded by an absolute constant, since they depend only on a{t,), M^iU), 
Mi(U), Miiti), and Mtih). From this fact we infer as before that 

$E((d(3) - E(d(3)))^2)$ tends to 0, as n tends to infinity, and our theorem is an
immediate consequence of the criterion 2.d.
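The behaviour of d(3) can be illustrated by simulation. The sketch below is not part of the paper: it assumes, purely for the example, that D(t) is a unit-rate exponential distribution shifted to have mean a(t) = t (so that $M_2 = 1$ and $M_3 = 2$), takes $t_j$ as the midpoint of its interval, and uses a seeded generator. It also computes the mean of the cubed successive differences for comparison.

```python
import random


def sample_changing_population(n, seed=0):
    """x_j drawn from a shifted exponential with mean a(t_j) = t_j, M_2 = 1, M_3 = 2."""
    rng = random.Random(seed)
    return [(j - 0.5) / n + rng.expovariate(1.0) - 1.0 for j in range(1, n + 1)]


def third_moment_statistics(xs):
    """Return d(3) and the mean of the cubed successive differences."""
    n = len(xs)
    d3 = sum((xs[j] - xs[j + 1]) ** 2 * (xs[j + 1] - xs[j + 2]) for j in range(n - 2)) / n
    cubed = sum((xs[j] - xs[j + 1]) ** 3 for j in range(n - 1)) / n
    return d3, cubed


if __name__ == "__main__":
    print(third_moment_statistics(sample_changing_population(300000)))
```

With these assumptions d(3) should come out near 2, while the statistic built from cubed differences should come out near 0, since the third moments of neighbouring observations cancel in the difference.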

Remark 1. If $M_3(t)$, $M_2(t)$ and a(t) are constant, it follows from the proof that

$$E(d(3)) = \frac{n-2}{n}\, M_3;$$

and thus $(n - 2)^{-1} \sum_{j=1}^{n-2} (x_j - x_{j+1})^2 (x_{j+1} - x_{j+2})$ is an unbiased estimate of $M_3$.

Remark 2. One might be tempted to use instead of d(3) the following function:

$$n^{-1} \sum_{j=1}^{n-1} (x_j - x_{j+1})^3.$$

By an argument of a nature rather similar to the one used in the preceding proof
one may show, however, that this statistic converges stochastically to 0.





Put $s(3) = n^{-1} \sum_{j=1}^{n} (x_j - \bar{x})^3$. Then s(3) is a function of the random sample
$O(T_n)$. Furthermore let

$$F_3 = 3 \left[ \int_0^1 a(t) M_2(t)\, dt - a M_2 \right] + \int_0^1 (a(t) - a)^3\, dt.$$

Theorem 2. s(3) converges stochastically to $M_3 + F_3$.

n 

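Proof: For fixed j, let $X(j) = \sum_{h=1}^{n} (x_j - a(t_j) + a(t_h) - x_h)$ and
$A(j) = \sum_{h=1}^{n} (a(t_j) - a(t_h))$. Then

$$E(s(3)) = n^{-4} \sum_{j=1}^{n} E\left( (X(j) + A(j))^3 \right) = n^{-4} \sum_{j=1}^{n} \left[ E(X(j)^3) + 3A(j)\, E(X(j)^2) + A(j)^3 \right],$$

since E(X(j)) is easily seen to be 0. We find furthermore that

$$E(X(j)^3) = (n-1)^3 M_3(t_j) + E\left( \left[ \sum_{h \ne j} (a(t_h) - x_h) \right]^3 \right) = ((n-1)^3 + 1)\, M_3(t_j) - \sum_{h=1}^{n} M_3(t_h),$$

$$E(X(j)^2) = (n-1)^2 M_2(t_j) + E\left( \left[ \sum_{h \ne j} (a(t_h) - x_h) \right]^2 \right) = ((n-1)^2 - 1)\, M_2(t_j) + \sum_{h=1}^{n} M_2(t_h).$$

Consequently

$$E(s(3)) = n^{-4} \left[ ((n-1)^3 - n + 1) \sum_{j=1}^{n} M_3(t_j) + 3((n-1)^2 - 1) \sum_{j=1}^{n} A(j) M_2(t_j) + 3 \sum_{j=1}^{n} A(j) \sum_{h=1}^{n} M_2(t_h) + \sum_{j=1}^{n} A(j)^3 \right].$$

Since furthermore $\sum_{j=1}^{n} A(j) = \sum_{j,h} (a(t_j) - a(t_h)) = 0$,

$$\sum_{j=1}^{n} A(j) M_2(t_j) = n \sum_{j=1}^{n} a(t_j) M_2(t_j) - \sum_{h=1}^{n} a(t_h) \sum_{j=1}^{n} M_2(t_j)$$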

and 





it is easily verified that E(s(3)) tends to $M_3 + F_3$, as n tends to infinity.

To prove that $E((s(3) - E(s(3)))^2)$ tends to 0 as n tends to infinity, one
proceeds as in the proofs of the preceding theorems, namely by verifying that
this expectation is $n^{-8}$ times a sum of not more than $4^2 n^7$ summands which are
bounded, since they depend only on $a(t_i)$ and on the $M_m(t_i)$ for 1 < m < 7.

The proof of the theorem may then be completed by applying the criterion 2.d.
It is readily seen that $F_3$ vanishes whenever a(t) is constant. But from

$$F_3 = 3 \int_0^1 (a(t) - a)\, M_2(t)\, dt + \int_0^1 (a(t) - a)^3\, dt$$

we infer that $F_3$ vanishes too whenever $M_2(t)$ is constant and a(t) is at the same
time symmetric with regard to a, and more precisely: if $M_2(t)$ is constant, a
necessary and sufficient condition for the vanishing of $F_3$ is the vanishing of the
third moment of the function a(t) around its mean. Thus we see that d(3)
is always a consistent statistic for $M_3$, though s(3) is not.


6. The fourth moment. The results in this section will be stated without
proof. Their proofs can be constructed on exactly the same lines as the proofs
in sections 4 and 5.


R—1 


and 


(2n)“‘ S 2 Ui-i “ •t/l’fJiii - 

,-l ;-2 


R—I 

n”* (Xj-i - i - Xjf 

JZi 


converge slocJiashcaUy fo + 3 / Mi{lf di. 

Jo 

r«-i -jj 

(4n)'”' 53 (% ~ converges slochaslically to Mi + Ml, 

L/-1 J 

ri—2 j'l 

(4?r)~' 53 (^)-i ~ ~ ®j+s)* converges slochaslically io / dt. 

y -5 Jo 

From these facts one easily deduces that Mi is the stochasUc limil of 

and lhai f (MM) — dt is the stochasUc limit of 
Jo 

(2n)“‘r£ (xj - xjn)* ~ gig (xj - - g (xj-i - Xj)\xj+i - aij+s)’]. 


7. Efficiency. If $f = f(O(T_n))$ is a function of the random sample $O(T_n)$,
and if f converges stochastically to a number r, then

$$\lim_{n \to \infty} n E\left( (f - r)^2 \right)$$

may be considered as some sort of a measure for the efficiency¹¹ ¹² of the statistic
f as an estimate of r, provided, of course, the limit exists.

Theorem 1. If the function a(t) is of bounded variation, then

$$\lim_{n \to \infty} n E\left( (\bar{x} - a)^2 \right) = M_2.$$


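Proof: Clearly

$$n E\left( (\bar{x} - a)^2 \right) = n^{-1} E\left( \left[ \sum_{j=1}^{n} (x_j - a) \right]^2 \right) = n^{-1} \left[ \sum_{j=1}^{n} M_2(t_j) + \left( \sum_{j=1}^{n} (a(t_j) - a) \right)^2 \right].$$

Now

$$\sum_{j=1}^{n} (a(t_j) - a) = \sum_{j=1}^{n} a(t_j) - na = \sum_{j=1}^{n} \left[ a(t_j) - n \int_{(j-1)/n}^{j/n} a(t)\, dt \right].$$

Since a(t) is a continuous function, there exists a number $u_j$ such that

$$(j - 1)/n \le u_j \le j/n, \quad \text{and} \quad \int_{(j-1)/n}^{j/n} a(t)\, dt = n^{-1} a(u_j).$$

Thus

$$\sum_{j=1}^{n} (a(t_j) - a) = \sum_{j=1}^{n} (a(t_j) - a(u_j)).$$

But both $t_j$ and $u_j$ are between (j - 1)/n and j/n, and a(t) is of bounded variation.
Hence there exists a constant A which depends on a(t) only and not on n
or $T_n$ such that

$$\left[ \sum_{j=1}^{n} (a(t_j) - a) \right]^2 < A \quad \text{for every choice of } T_n.$$

The contention of our theorem is a fairly immediate consequence of these facts.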
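The expression for $n E((\bar{x} - a)^2)$ obtained at the start of the proof can be evaluated exactly for a given family, without sampling. The following sketch is illustrative only: the choice $t_j = j/n$ and the functions a(t) = t, $M_2(t) = 1 + t$ are assumptions made for the example.

```python
def scaled_mse_of_mean(n, mean_fn, var_fn, mean_avg):
    """n * E((x_bar - a)^2) = n^{-1} [sum M_2(t_j) + (sum (a(t_j) - a))^2]."""
    ts = [j / n for j in range(1, n + 1)]  # hypothetical choice t_j = j/n
    variance_sum = sum(var_fn(t) for t in ts)
    bias_sum = sum(mean_fn(t) - mean_avg for t in ts)
    return (variance_sum + bias_sum ** 2) / n


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, scaled_mse_of_mean(n, lambda t: t, lambda t: 1.0 + t, 0.5))
```

For this family the value works out to 1.5 + 3/(4n), which approaches the mean variance $M_2 = 1.5$; the bounded bias term contributes only the part of order 1/n.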

This theorem and its proof may serve as an additional substantiation of the 
remarks appended to section 3. 

Remark: If we had assumed only the continuity of a(t) instead of its being
of bounded variation, we could have tried to argue as follows: Since a(t) is continuous,
there exists to every positive number ε an integer N(ε) such that
$|a(t') - a(t'')| < \varepsilon$ for $|t' - t''| < N(\varepsilon)^{-1}$. Hence we would find that for N(ε) < n
we have

$$\left| \sum_{j=1}^{n} (a(t_j) - a) \right| < n\varepsilon,$$

and this inequality is certainly insufficient for proving that the left side of the
inequality tends to 0 as n tends to infinity.

Theorem 2. If the functions a(t) and $M_2(t)$ are both of bounded variation, then

$$\lim_{n \to \infty} n E\left( (d^2 - M_2)^2 \right) = M_4.$$

¹¹ Wilks [2], p. 134/135.

¹² Or a measure for the asymptotic variance of the function f.





Proof: In the course of the proof of Theorem 1 of section 4 we have shown
that $E(d^2) = (2n)^{-1}(A + B - C)$, where

$$A = 2 \sum_{j=1}^{n} M_2(t_j), \quad B = \sum_{j=1}^{n-1} (a(t_j) - a(t_{j+1}))^2, \quad C = M_2(t_1) + M_2(t_n).$$

Since $M_2(t)$ is bounded, it is clear that $n^{-1/2} C$ tends to 0 as n tends to infinity.
Since a(t) is of bounded variation, there exists a constant B* such that B < B*
for every choice of $T_n$, and hence $n^{-1/2} B$ tends to 0 as n tends to infinity¹³. Furthermore
we have


n " P rl/w 

z MiUi) - nM, 2 - n . 

i-i i-i L 

Because of the continuity of iV/j(0 there exist numbers v, such that 

r)/'n 

{j — l)/n < vj < j/n, and Msfuj) = n / 

ht-D/n 

Consequently 

E - nM, - E - Mi(iq)]. 

1-1 


But Mi{t) is a function of bounded variation, and tlius we may infer, as in the 
proof of Theorem 1, that n'[(2n) '^1 — Mi] tends to 0 lus n tends to infinity. 
Combining all the facta we sec that n*[F(d’) — Mi] tends to 0 na n tends to in¬ 
finity, and hence we have shown that n[K{d^) - Mif tends to 0, os n tends to 
infinity. 

As in the proof of Theorem 1 of section 4 we note next that

\[ E(d^4) - E(d^2)^2 = (2n)^{-2} \sum_{i,j} (i, j), \]

where $(i, j) = E((x_i - x_{i+1})^2 (x_j - x_{j+1})^2) - E((x_i - x_{i+1})^2)E((x_j - x_{j+1})^2)$,
and that $(i, j) = 0$, if either $i + 1 < j$ or $j + 1 < i$. Next we observe that

\[ \begin{aligned}
(i, j) ={}& E((x_i - a(t_i) + a(t_{i+1}) - x_{i+1})^2 (x_j - a(t_j) + a(t_{j+1}) - x_{j+1})^2) \\
 &- E((x_i - a(t_i) + a(t_{i+1}) - x_{i+1})^2)\,E((x_j - a(t_j) + a(t_{j+1}) - x_{j+1})^2) \\
 &+ (a(t_i) - a(t_{i+1}))(i, j)' + (a(t_j) - a(t_{j+1}))(i, j)'',
\end{aligned} \]

where the expressions $(i, j)'$ and $(i, j)''$ are bounded (by a number independent
of $i$, $j$, $n$ or $T$).

Consequently we have

\[ \begin{aligned}
(i, i) ={}& M_4(t_i) + 6M_2(t_i)M_2(t_{i+1}) + M_4(t_{i+1}) - (M_2(t_i) + M_2(t_{i+1}))^2 \\
 &+ (a(t_i) - a(t_{i+1}))(i, i)^* \\
 ={}& M_4(t_i) + M_4(t_{i+1}) + M_2(t_i)^2 + M_2(t_{i+1})^2 \\
 &- 2(M_2(t_i) - M_2(t_{i+1}))^2 + (a(t_i) - a(t_{i+1}))(i, i)^*,
\end{aligned} \]

where $(i, i)^* = (i, i)' + (i, i)''$ is bounded by a bound independent of $i$, $n$, $T_n$.

A remark similar to the one made just before stating Theorem 2 may be made here and
below about the indispensability of the hypothesis that $a(t)$ and $M_2(t)$ be of bounded
variation.



SAMPLING 




Likewise we find that

\[ \begin{aligned}
(i, i+1) ={}& M_2(t_i)M_2(t_{i+1}) + M_2(t_i)M_2(t_{i+2}) + M_4(t_{i+1}) + M_2(t_{i+1})M_2(t_{i+2}) \\
 &- (M_2(t_i) + M_2(t_{i+1}))(M_2(t_{i+1}) + M_2(t_{i+2})) \\
 &+ (a(t_i) - a(t_{i+1}))(i, i+1)' + (a(t_{i+1}) - a(t_{i+2}))(i, i+1)'' \\
 ={}& M_4(t_{i+1}) - M_2(t_{i+1})^2 \\
 &+ (a(t_i) - a(t_{i+1}))(i, i+1)' + (a(t_{i+1}) - a(t_{i+2}))(i, i+1)''.
\end{aligned} \]

Hence

\[ \begin{aligned}
(i, i) + 2(i, i+1) ={}& M_4(t_i) + 3M_4(t_{i+1}) + (M_2(t_i) - M_2(t_{i+1}))(3M_2(t_{i+1}) - M_2(t_i)) \\
 &+ (a(t_i) - a(t_{i+1}))(i, i)^{+} + 2(a(t_{i+1}) - a(t_{i+2}))(i, i+1)'',
\end{aligned} \]

where $(i, i)^{+} = (i, i)' + (i, i)'' + 2(i, i+1)'$ is bounded by a bound independent
of $i$, $n$, $T$. Considering that

\[ \sum_{i,j} (i, j) = \sum_{i} (i, i) + 2\sum_{i} (i, i+1), \]

it is now deduced from the continuity of the functions $a(t)$, $M_2(t)$ and $M_4(t)$ that
$n[E(d^4) - E(d^2)^2]$ tends to $M_4$, as $n$ tends to infinity. We note finally that
$E((d^2 - M_2)^2) = E((d^2 - E(d^2))^2) + (E(d^2) - M_2)^2$, and the theorem is an
immediate consequence of the facts we have deduced.

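Theorem 3. If the functions $a(t)$ and $M_2(t)$ are both of bounded variation, then

\[ \begin{aligned}
\lim_{n\to\infty} nE\Big(\Big(s^2 - M_2 - \int_0^1 (a(t) - a)^2\,dt\Big)^2\Big)
 ={}& M_4 - \int_0^1 M_2(t)^2\,dt + 4\int_0^1 (a(t)M_3(t) - aM_3)\,dt \\
 &+ 4\int_0^1 M_2(t)(a(t) - a)^2\,dt .
\end{aligned} \]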


Proof: Since $a(t)$ and $M_2(t)$ are of bounded variation, we show, as in the
proofs of the two preceding theorems, that

\[ n^{-1/2}\sum_{j=1}^{n} (a(t_j) - a), \qquad
   n^{1/2}\Big(n^{-1}\sum_{j=1}^{n} a(t_j)^2 - \int_0^1 a(t)^2\,dt\Big), \qquad
   n^{1/2}\Big(n^{-1}\sum_{j=1}^{n} M_2(t_j) - M_2\Big) \]

all tend to 0, as $n$ tends to infinity. In the proof of Theorem 2 of section 4 we
computed $E(s^2)$. Using this result we obtain:


n‘(fi(s') - Mil (a (0 - afdt) 

Jo 

= n‘(n-' E Mid,) - Mi) -|- n-^n-^ Z Midf) 

j—1 J—1 

+ n\n-^E<kf - f a{tfdt) 

Jo 

+ n^(a‘ - 2 a(«i)"l ^ 





where one should remember the identity $\int_0^1 (a(t) - a)^2\,dt = \int_0^1 a(t)^2\,dt - a^2$.


But 




rr‘I] tt(0) 

J“1 


^ = n) (a - n ‘ g a(0)^(^a -f- n ' 


where the last factor on the right is bounded by a bound independent of $n$ and
$T_n$. Hence it follows that

\[ n^{1/2}\Big(E(s^2) - M_2 - \int_0^1 (a(t) - a)^2\,dt\Big) \text{ tends to 0, as } n \text{ tends to infinity.} \]

By a computation of great length and little interest one shows that
nE((s' - Eif)f) - n-‘ \(n ~ 1)* E + -la(n -1)1: 






- 4(a - 1) Z M,{0 E aih) + 2ri: Ms(t,)T 

,-l A-l Ll“l J 

- (n* - 2n + 3) Z -h -ln“ Z Ms(t,)a(i,)* 

;~1 /-I 

- 8n Z «(<j) D nih)M;ik) 


•4* 4 


z rt(f,)T z ms(/a) 


It is readily seen that this expression tends to 

M4 + 4 f' MaWoW dt - 4Maa - f'Mi(lf dt + 4 f M,(t)a(lf 
Jo Jo Jo 

— 8rt f (i(l)Mi(t) dt -h dfl'Mj, 
Jo 


' dt 


and now it is clear how to complete the proof of our theorem. 

Corollary 1. If $a(t)$ is constant and $M_2(t)$ of bounded variation, then

\[ \lim_{n\to\infty} nE((s^2 - M_2)^2) = M_4 - \int_0^1 M_2(t)^2\,dt . \]

This is an almost immediate consequence of Theorem 3, since $a(t) = a$, if
$a(t)$ is constant.

It has been shown in section 4 that $d^2$ is always a consistent estimate of $M_2$
whereas $s^2$ is a consistent estimate of $M_2$ if, and only if, $a(t)$ is constant.
Theorem 2 and Corollary 1 offer a basis for comparing the efficiency of these two
statistics. Since

\[ 0 < M_2(t)^2 < M_4(t) \quad\text{for every } t \]

(apart from trivial exceptions), we infer from Theorem 2 and Corollary 1 the
following fact.

Corollary 2. If $a(t)$ is constant and $M_2(t)$ of bounded variation, then

\[ \lim_{n\to\infty} \frac{E((s^2 - M_2)^2)}{E((d^2 - M_2)^2)} = \frac{M_4 - \int_0^1 M_2(t)^2\,dt}{M_4}, \]

and this expression is always positive and smaller than 1.

Thus we may say roughly that for large $n$ the estimate $s^2$ of $M_2$ is more efficient
than the estimate $d^2$ in case both may be used. We do not, however, offer
any information on the necessary size of $n$. Neither do we claim that for small
$n$ it might not happen that $d^2$ gives a good estimate and $s^2$ a poor one.
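
As a worked illustration of the limiting ratio in Corollary 2, the sketch below evaluates (M_4 − ∫M_2(t)^2 dt)/M_4 under the simplifying assumption, made here only for the example, that M_2(t) and M_4(t) do not depend on t, so that the integral equals M_2^2. The function name and the two sample distributions are hypothetical choices, not part of the paper.

```python
def limiting_efficiency_ratio(m2, m4):
    """Limit of E((s^2 - M2)^2) / E((d^2 - M2)^2) for constant a(t),
    assuming (for this sketch) that M2(t) = m2 and M4(t) = m4 for all t."""
    if m4 <= 0:
        raise ValueError("the fourth central moment must be positive")
    return (m4 - m2 * m2) / m4


# Normal population with unit variance: M4 = 3 * sigma^4 = 3.
normal_ratio = limiting_efficiency_ratio(1.0, 3.0)

# Uniform population with unit variance: M4 = 9/5.
uniform_ratio = limiting_efficiency_ratio(1.0, 9.0 / 5.0)

print(normal_ratio, uniform_ratio)
```

In both examples the ratio lies strictly between 0 and 1, in line with the statement of the corollary.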

REFERENCES 

[1] J. von Neumann, R. H. Kent, H. R. Bellinson and B. I. Hart, "The mean square
successive difference," Annals of Math. Stat., Vol. 12 (1941), pp. 153-162.

[2] S. S. Wilks, Mathematical Statistics, Princeton, N. J., 1943.


It has been pointed out before that $s^2$ is a consistent estimate of $M_2$ if, and only if,
$a(t)$ is constant, and thus the efficiency of $s^2$ and $d^2$ as estimates of $M_2$ may be compared only
if $a(t)$ is constant.



TESTING THE HOMOGENEITY OF POISSON FREQUENCIES 

By Paul G. Hoel
University of California at Los Angeles

1. Introduction. The standard procedure for testing the homogeneity of a
set of k Poisson frequencies seems to be to apply the Poisson index of dispersion
to those frequencies. The originators of this procedure [1] pointed out that this
procedure may be regarded as a $\chi^2$ test of goodness of fit in which the Poisson
frequencies constitute observed frequencies corresponding to k cells with equal
expected values. Somewhat later it was shown [2] that the corresponding
likelihood ratio test was approximately equivalent to the index of dispersion test.
Then the problem was approached from the viewpoint of conditional variation
[3], [4]. This approach permitted exact tests to be studied in some detail for
small samples. A few years later an exact test for the special case of k = 2
was introduced and studied [5]. In this investigation consideration was given for
the first time to the efficiency of the proposed test. Tables of critical regions
for the test and tables for computing the power of the test corresponding to
certain alternatives were made available.

In spite of the desirable features of this last test, it still possesses certain
drawbacks. First, this test, as well as the others referred to, did not consider the
problem in which the rate of occurrence of a rare event is constant but for which
the sampling units differ in size. For example, these methods were not designed
to enable one to test whether a factory's accident rate had remained unchanged
during the past month as compared with the preceding three months. Second,
in order to use this test it is necessary to possess the special tables or charts of
critical regions constructed for the test.

In this paper a method which does not require special tables is considered for
dealing with these more general situations. In the course of the development
it is shown that this method is, in a certain sense, the best method possible for
testing the hypothesis of homogeneity against one sided alternatives. Since this
paper is principally concerned with removing the undesirable features of the
method advocated in the last mentioned paper, it is advisable to read that paper
in conjunction with this one. The procedure to be followed here will be to derive
a uniformly most powerful test, show that it is equivalent to a $\chi^2$ test, and then
compare it with the previously mentioned test.

2. Similar regions. In the following two sections a study will be made of the
efficiency of a generalization of the critical region proposed in [5]. For this
purpose let $x$ and $y$ represent sample frequencies from two independent Poisson
distributions with means $m_x$ and $m_y$. The probability of obtaining this sample
is given by

\[ P(x, y) = \frac{e^{-m_x} m_x^{x}}{x!}\,\frac{e^{-m_y} m_y^{y}}{y!}. \tag{1} \]

Following the notation and procedure given in [5], let

\[ \mu = m_x + m_y, \qquad p = \frac{m_x}{m_x + m_y}, \qquad n = x + y. \tag{2} \]

Then algebraic manipulation will show that $P(x, y)$ reduces to

\[ P(x, y) = \frac{e^{-\mu}\mu^{n}}{n!}\,\frac{n!}{x!\,(n-x)!}\,p^{x}(1 - p)^{n-x}. \tag{3} \]

The hypothesis which it is desired to test is that

\[ \frac{m_y}{m_x} = r, \tag{4} \]


where $r$ has been specified. The value of $r$ will often be the ratio of the sizes of
the two populations under consideration or the ratio of the time units of the two
samples. In many situations the alternatives to (4) which are of interest will
be one-sided. For example, after a factory has instituted a safety campaign,
it would be of interest to see if the rate was unaffected as against the possibility
of the rate having decreased; hence the alternatives to (4) would be

\[ \frac{m_y}{m_x} < r. \tag{5} \]

In terms of the parameters introduced in (2), the hypothesis (4) and its
alternatives (5) become

\[ p = \frac{1}{1 + r} \quad\text{and}\quad p > \frac{1}{1 + r}. \tag{6} \]
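
The factorization (3) and the reparametrization (6) can be checked numerically. The following is a minimal sketch; the function names and the numerical values are illustrative assumptions, not part of the paper. It evaluates the joint Poisson probability (1) directly and again as the product of a Poisson probability for n = x + y and a binomial probability for x given n.

```python
from math import comb, exp, factorial, isclose


def poisson_pmf(k, mean):
    """Poisson probability of observing k when the mean is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def joint_direct(x, y, m_x, m_y):
    """Equation (1): product of two independent Poisson probabilities."""
    return poisson_pmf(x, m_x) * poisson_pmf(y, m_y)


def joint_factored(x, y, m_x, m_y):
    """Equation (3): Poisson probability of n times binomial probability of x given n."""
    mu = m_x + m_y
    p = m_x / mu
    n = x + y
    return poisson_pmf(n, mu) * comb(n, x) * p ** x * (1 - p) ** (n - x)


def p_under_hypothesis(r):
    """Equation (6): value of p when m_y / m_x = r."""
    return 1.0 / (1.0 + r)


direct = joint_direct(3, 5, 2.0, 6.0)
factored = joint_factored(3, 5, 2.0, 6.0)
print(direct, factored, p_under_hypothesis(3.0))
```

With m_x = 2 and m_y = 6 the ratio r is 3, so p = 1/4, which is also m_x/(m_x + m_y).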

Consider the probability given by (3) in much the same manner as was done
in [5]. This probability depends upon two parameters, $\mu$ and $p$, only the latter
of which is specified by the hypothesis; consequently if critical regions
independent of $\mu$ are desired, it will be necessary to find similar regions [6] with respect
to $\mu$. Since $x$ and $y$ are discrete variables, it is not possible to find similar
regions of arbitrary size; consequently it will be necessary to introduce continuous
approximating functions if such regions are desired and if best critical regions
are to be found. Toward this end consider the expression for $P(x, y)$ in (3).
It states that the probability that $x$ and $y$ will take on specified values is the
Poisson probability that the sample point will fall on the line $x + y = n$,
multiplied by the binomial conditional probability that the point will have the specified
$x$ coordinate when the point is known to lie on this line. If $p$ and $n$ are not small,
this binomial function could be approximated well by means of a normal function.
Or, if desired, factorials could be replaced by corresponding gamma functions
and the necessary normalizing factor introduced. Regardless of what
continuous function is chosen, a region on each line $x + y = n$ $(n = 0, 1, 2, \cdots)$
can be selected such that the conditional probability for this approximating
function is $\alpha$ that a point on that line will lie in that region. Most natural
approximating functions would become trivial for $n = 0$; therefore it may be
necessary to choose an artificial function for this case or to adopt a convention
of letting the origin be the critical region for this case but accepting only $100\alpha$
percent of samples for which $n = 0$ as belonging to this critical region. The
totality of such $\alpha$ regions will constitute a critical region of size $\alpha$ which is
independent of $\mu$ because from (3) the probability of a point lying in this critical region
would now be given by

\[ \sum_{n=0}^{\infty} \frac{e^{-\mu}\mu^{n}}{n!}\,\alpha = \alpha\, e^{-\mu} \sum_{n=0}^{\infty} \frac{\mu^{n}}{n!} = \alpha. \]

Thus, similar regions with respect to $\mu$ of size $\alpha$ can be obtained by selecting
regions of size $\alpha$ on each line $x + y = n$.

The preceding method for obtaining similar regions is the only method for
doing so if such regions are restricted to be found on the lines $x + y = n$, because
if a region of size $\alpha_n$ were selected on each line $x + y = n$, it would be necessary that

\[ \sum_{n=0}^{\infty} \frac{e^{-\mu}\mu^{n}}{n!}\,\alpha_n = \alpha \]

independent of $\mu$. This is equivalent to requiring that

\[ e^{\mu} = \sum_{n=0}^{\infty} \frac{\alpha_n}{\alpha}\,\frac{\mu^{n}}{n!}; \]

but since the power series for $e^{\mu}$ is unique, it follows that $\alpha_n = \alpha$.


3. Common best critical region. Among these similar regions there will exist
a best critical region for testing the hypothesis $p = p_0$ against the single
alternative $p = p_1$ if there exist best critical regions on each line $x + y = n$. From (6)
it will be observed that this formulation is equivalent to testing the hypothesis
$r = r_0$ against the single alternative $r = r_1$. The best critical region [6] on such
a line, if it exists, will be that region which satisfies the inequality

\[ \frac{f(x; p_0)}{f(x; p_1)} < k, \tag{7} \]

where $f$ denotes the continuous function selected to approximate the binomial
distribution on this line and $k$ is a constant determined so that the probability,
under the hypothesis $p = p_0$, will be $\alpha$ that a point on this line will lie in this
region. If the normal approximating function with $m = np$ and $\sigma^2 = npq$ is
used, (7) becomes

\[ \sqrt{\frac{p_1 q_1}{p_0 q_0}}\;
   \exp\Big\{-\frac{1}{2}\Big[\frac{(x - np_0)^2}{np_0 q_0} - \frac{(x - np_1)^2}{np_1 q_1}\Big]\Big\} < k. \tag{8} \]

After completing the square in $x$, it will be found that this inequality reduces to

\[ \Big(\frac{1}{p_1 q_1} - \frac{1}{p_0 q_0}\Big)
   \Big[x - \frac{n(1/q_1 - 1/q_0)}{1/p_1 q_1 - 1/p_0 q_0}\Big]^2 < c, \tag{9} \]

where $c$ is independent of $x$.





If $x_0$ is a value of $x$ such that

\[ P[x > x_0 \mid p = p_0] = \alpha, \tag{10} \]

then (9) will hold for $x > x_0$ provided that $p_1 > p_0$. To demonstrate this fact,
it is convenient to consider the three cases $p_0 + p_1 > 1$, $p_0 + p_1 < 1$ and
$p_0 + p_1 = 1$ separately. If $p_0 + p_1 > 1$,

\[ \frac{1}{q_1} - \frac{1}{q_0} > 0, \qquad \frac{1}{p_1 q_1} - \frac{1}{p_0 q_0} > 0, \qquad
   \frac{1}{q_1} - \frac{1}{q_0} \ge \frac{1}{p_1 q_1} - \frac{1}{p_0 q_0}, \]

and therefore

\[ x \le n \le n\Big(\frac{1}{q_1} - \frac{1}{q_0}\Big) \Big/ \Big(\frac{1}{p_1 q_1} - \frac{1}{p_0 q_0}\Big). \]

Since the coefficient of the brackets in (9) which involves $x$ is positive, increasing $x$
will reduce the left side of (9). If $p_0 + p_1 < 1$,

\[ \frac{1}{p_1 q_1} - \frac{1}{p_0 q_0} < 0 \quad\text{and}\quad
   \frac{n(1/q_1 - 1/q_0)}{1/p_1 q_1 - 1/p_0 q_0} < 0. \]

Since the coefficient is now negative, increasing $x$ will reduce the left side of (9).
Finally, if $p_0 + p_1 = 1$, (9) will reduce to

\[ \Big(\frac{1}{p_1} - \frac{1}{p_0}\Big)\, x < c'. \]

Since $1/p_1 - 1/p_0 < 0$, increasing $x$ will decrease the left side of this inequality.
It therefore follows that the region defined by (10) is a best critical region for
every alternative of the form $p_1 > p_0$ on the line $x + y = n$. The totality of
such regions for $n > 0$, together with the previously mentioned convention for
$n = 0$, then constitutes a common best critical region among all possible similar
regions for testing the hypothesis (4) against the set of alternatives (5).
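
The region defined by (10) is simple to compute once the normal approximation with mean n p_0 and variance n p_0 q_0 is adopted. The sketch below is one possible reading of that rule; the function names are hypothetical, and the randomized convention for n = 0 described earlier is not modelled (such samples are simply not rejected here).

```python
from math import sqrt
from statistics import NormalDist


def critical_value(n, p0, alpha):
    """Value x0 with P[x > x0 | p = p0] = alpha under the normal
    approximation with mean n*p0 and variance n*p0*(1 - p0)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return n * p0 + z * sqrt(n * p0 * (1.0 - p0))


def reject_homogeneity(x, y, r, alpha):
    """One-sided test of m_y/m_x = r against m_y/m_x < r, i.e. p > 1/(1 + r).
    Assumption of this sketch: samples with n = 0 are never rejected."""
    n = x + y
    if n == 0:
        return False
    p0 = 1.0 / (1.0 + r)
    return x > critical_value(n, p0, alpha)


x0 = critical_value(100, 0.5, 0.05)
print(round(x0, 3))
print(reject_homogeneity(60, 40, 1.0, 0.05), reject_homogeneity(55, 45, 1.0, 0.05))
```

Because the same rule is applied on every line x + y = n, no table indexed by n is needed, only the normal quantile.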

In a similar manner it will be found that if the inequality in (10) is reversed,
the critical region so defined, together with the convention, will constitute a
common best critical region for every alternative of the form $p_1 < p_0$. If the
alternative hypotheses consist of $p \ne p_0$, there will not exist a common best
critical region using these approximating functions.

The critical region proposed in [5] is that for the special hypothesis $p_0 = \tfrac{1}{2}$ and
the set of alternatives $p \ne p_0$. It will be found that the lower half of this critical
region for $P = 2\alpha$ will differ little, except for very small samples, from that given
by (10) for this special case; however, it possesses the disadvantage of being
numerical and therefore of requiring a special table. The critical region given
by (10) does not possess this disadvantage. This fact will be demonstrated in
the next section.


4. Chi-square test. Consider the problem of testing compatibility between
observed and expected frequencies in two cells. Let $x$ and $y$ represent the
observed frequencies and $e_1$ and $e_2$ the expected frequencies in a sample of size $n$.
If the probability that an observation will fall in the first cell is, as in (6),
$p = 1/(1 + r)$, then

\[ e_1 = np = \frac{x + y}{1 + r}, \qquad e_2 = n(1 - p) = \frac{r(x + y)}{1 + r}. \]

The chi-square function for testing compatibility then reduces to

\[ \chi^2 = \frac{(x - e_1)^2}{e_1} + \frac{(y - e_2)^2}{e_2} = \frac{(rx - y)^2}{r(y + x)}. \tag{11} \]
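
The reduction to the closed form in (11) can be confirmed on a small example. In the sketch below the function names and the counts are illustrative assumptions; the first function sums the two cell contributions with expected values np and n(1 − p), and the second evaluates the closed form directly.

```python
from math import isclose


def chi_square_two_cells(x, y, r):
    """Sum of (observed - expected)^2 / expected over the two cells,
    with p = 1/(1 + r) as in (6)."""
    n = x + y
    e1 = n / (1.0 + r)
    e2 = r * n / (1.0 + r)
    return (x - e1) ** 2 / e1 + (y - e2) ** 2 / e2


def chi_square_closed_form(x, y, r):
    """Closed form (11): (r*x - y)^2 / (r*(x + y))."""
    return (r * x - y) ** 2 / (r * (x + y))


# Hypothetical counts: 12 events in one unit of exposure, 30 in three units.
by_cells = chi_square_two_cells(12, 30, 3.0)
closed = chi_square_closed_form(12, 30, 3.0)
print(by_cells, closed)
```

Both expressions require x + y > 0; the case n = 0 is handled by the separate convention described in section 2.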



Figure 1.

Let $\chi_0^2$ be the value of $\chi^2$ such that $P[\chi^2 > \chi_0^2] = 2\alpha$ for one degree of freedom.
With $\chi^2$ replaced by $\chi_0^2$ in (11), this equation determines a parabola in the $x$, $y$
plane. If $x + y = n$ is not small, the probability of a point on the line $x + y = n$
lying outside of this parabola will be approximately $2\alpha$, the accuracy depending
on the accuracy of the $\chi^2$ approximation, and hence the probability of a point
lying outside of and below this parabola will be approximately $\alpha$. Thus, a critical
region for testing $p = p_0$ against $p > p_0$ will be given by that part of the positive
$x$, $y$ plane which lies below this parabola. In Figure 1 the lower half of this
parabola for the special case of $p_0 = \tfrac{1}{2}$ is indicated by the symbol $\chi^2$. The critical
region for the alternatives $p < p_0$ would be the region lying above the upper half
of this same parabola, while the critical region for the alternatives $p \ne p_0$ would
consist of both of these regions at the $2\alpha$ level. For one degree of freedom, $\chi$
has a standard normal distribution; consequently the critical region given by
(11) is the same as that given by (10) in which a normal approximation is used
on each line $x + y = n$. This equivalence is easily verified by replacing $y$ by
$n - x$ and $r$ by $q/p$ in (11).

5. Likelihood ratio test. The chi-square test of the preceding section yields
a common best critical region for testing (4) against (5) for the normal
approximation. It is interesting to compare this critical region with that obtained by
the maximum likelihood principle, which requires no such approximations.
Consider, therefore, the two dimensional parameter space

\[ \Omega: \quad m_x > 0, \quad m_y > 0, \]

and the subspace

\[ \omega: \quad \frac{m_y}{m_x} = r. \]

Maximizing $P$ in (1) over $\Omega$ yields $\hat{m}_x = x$ and $\hat{m}_y = y$. Maximizing $P$ over $\omega$,
treating $P$ as a function of $m_x$, yields $\hat{m}_x = (x + y)/(1 + r)$. Then the maximum
likelihood ratio becomes

\[ \lambda = \frac{\max_{\omega} P}{\max_{\Omega} P}
   = \frac{e^{-(x+y)} \big(\frac{x+y}{1+r}\big)^{x} \big(\frac{r(x+y)}{1+r}\big)^{y} \big/ x!\,y!}
          {e^{-(x+y)}\, x^{x} y^{y} \big/ x!\,y!}. \]

This reduces to

\[ \lambda = \Big(\frac{x + y}{1 + r}\Big)^{x+y} \frac{r^{y}}{x^{x} y^{y}}. \tag{12} \]

For a fixed value of $\lambda$, this equation determines a curve in the $x$, $y$ plane which
may be used to determine a critical region. Since $-2 \log \lambda$ is known to possess
an asymptotic chi-square distribution under certain conditions [7], choose as
critical region that part of the positive $x$, $y$ plane lying below the curve determined
by (12) when $\lambda$ has been replaced by $\lambda_0$, where $\lambda_0$ is determined from
$-2 \log \lambda_0 = \chi_0^2$. This curve may be plotted by reducing it to the parametric form

\[ x = \frac{\log \lambda_0}{(1 + v)\log\frac{1 + v}{1 + r} + v \log\frac{r}{v}}, \qquad y = vx. \]
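
The parametric form can be checked against (12) by substituting a point of the curve back into the logarithm of the likelihood ratio. The sketch below does this; the function names are hypothetical, 0 log 0 is taken as 0, and the value −1.92 is used for log λ_0 only as an illustrative number (roughly −χ_0^2/2 for χ_0^2 near 3.84).

```python
from math import isclose, log


def log_lambda(x, y, r):
    """Logarithm of the likelihood ratio (12); requires x + y > 0.
    The convention 0 * log(0) = 0 is an assumption of this sketch."""
    def xlogx(v):
        return v * log(v) if v > 0 else 0.0

    n = x + y
    return n * log(n / (1.0 + r)) + y * log(r) - xlogx(x) - xlogx(y)


def curve_point(v, r, log_lambda0):
    """Point (x, y) with y = v*x on the curve where log(lambda) = log_lambda0.
    Undefined at v = r, where the denominator vanishes."""
    denom = (1.0 + v) * log((1.0 + v) / (1.0 + r)) + v * log(r / v)
    x = log_lambda0 / denom
    return x, v * x


on_hypothesis = log_lambda(5, 15, 3.0)          # y = r*x gives lambda = 1
px, py = curve_point(1.0, 3.0, -1.92)
recovered = log_lambda(px, py, 3.0)
print(on_hypothesis, px, py, recovered)
```

Sweeping v over values below r traces the lower branch of the curve, which is the boundary of the one-sided critical region discussed above.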

A comparison of the critical regions corresponding to (11), (12), and a slight
modification of [5] for the special case of $p_0 = \tfrac{1}{2}$ and $\alpha = .05$ is given in the
accompanying sketch. The modification of [5] consists in choosing $x_0$ to be that integer
which most nearly satisfies (10), rather than to be the smallest integer for which
the left side of (10) does not exceed $\alpha$. The latter method of choosing $x_0$ has a
tendency to make the first type of error considerably smaller than $\alpha$ for small
values of $n$. It will be observed that there are no appreciable differences between
the maximum likelihood and chi-square critical regions. Furthermore, it will
be found that there are only two values of $n$, namely $n = 3$ and $n = 9$, for $n < 30$
for which the chi-square test and the modification of [5] would yield different
decisions at this significance level.

The preceding sections show that the chi-square test is highly satisfactory for
testing the homogeneity of two Poisson frequencies, except possibly for very
small frequencies, and that therefore special numerical tables are not necessary.


6. Several Poisson frequencies. The generalization of (11) for a set of k
frequencies is, of course, the ordinary chi-square function

\[ \chi^2 = \sum_{i=1}^{k} \frac{(x_i - np_i)^2}{np_i}, \tag{13} \]

where $n = \sum_{i=1}^{k} x_i$, $p_i$ is proportional to the sampling unit from which $x_i$ was
obtained, and $\sum_{i=1}^{k} p_i = 1$. The Poisson index of dispersion is merely a special
case of (13) when $p_i = 1/k$. The adequacy of (13) for this special case has been
studied elsewhere [3], [8], while studies of (13) in general are numerous and well
known.
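
A short sketch of (13) follows, with the index of dispersion as the equal-unit special case. The function names, the `sizes` argument and the counts are illustrative assumptions rather than anything prescribed by the paper.

```python
from math import isclose


def poisson_chi_square(counts, sizes=None):
    """Chi-square function (13). `sizes` are the sampling-unit sizes;
    p_i is taken proportional to them. Equal sizes are assumed if omitted."""
    k = len(counts)
    n = sum(counts)
    if sizes is None:
        sizes = [1.0] * k
    total = float(sum(sizes))
    props = [s / total for s in sizes]
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(counts, props))


def index_of_dispersion(counts):
    """Poisson index of dispersion: sum of squared deviations over the mean."""
    mean = sum(counts) / len(counts)
    return sum((x - mean) ** 2 for x in counts) / mean


equal_units = poisson_chi_square([4, 7, 9])
dispersion = index_of_dispersion([4, 7, 9])
two_cells = poisson_chi_square([12, 30], sizes=[1.0, 3.0])
print(equal_units, dispersion, two_cells)
```

For k = 2 with unit sizes 1 and r the function reproduces (11), since p_1 = 1/(1 + r) and p_2 = r/(1 + r).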


REFERENCES

[1] R. A. Fisher, H. G. Thornton, and W. A. Mackenzie, "The accuracy of the plating
method of estimating the density of bacterial populations," Annals of Applied
Biology, Vol. 9 (1922), pp. 326-359.

[2] P. V. Sukhatme, "The problem of k samples for Poisson population," Proceedings of
the National Institute of Sciences of India, Vol. 3 (1937), pp. 297-305.

[3] W. G. Cochran, "The chi-square distribution for the binomial and Poisson series
with small expectations," Annals of Eugenics, Vol. 7 (1936), pp. 207-217.

[4] M. S. Bartlett, "Properties of sufficiency and statistical tests," Roy. Soc. Proc.,
Series A, Vol. 160 (1937), pp. 268-282.

[5] J. Przyborowski and H. Wilenski, "Homogeneity of results in testing samples from
Poisson series," Biometrika, Vol. 31 (1939), pp. 313-323.

[6] J. Neyman and E. S. Pearson, "On the problem of the most efficient tests of statistical
hypotheses," Roy. Soc. Phil. Trans., Vol. 231 (1933), pp. 289-337.

[7] S. S. Wilks, "The large-sample distribution of the likelihood ratio for testing
composite hypotheses," Annals of Math. Stat., Vol. 9 (1938), pp. 60-62.

[8] P. G. Hoel, "On indices of dispersion," Annals of Math. Stat., Vol. 14 (1943), pp.
155-162.



SOME COMBINATORIAL FORMULAS ON MATHEMATICAL 

EXPECTATION 


By L. C. Hsu 

National Southwest Associated University, Kunming, China 

The main problem considered here may be stated as follows: 

Let $f_1(x), \cdots, f_n(x)$ be n polynomials. It is the purpose of this paper to
establish formulas concerning the mathematical expectation (probable value)
of the product

\[ f_1(x_1) \cdots f_n(x_n), \]

where $x_1, \cdots, x_n$ are positive random variables and the sum of these is supposed
known.

Before establishing the formulas let us introduce some notations for con¬ 
venience. 

1. Notation. (A) In this paper the notation $(m; k; x_1, \cdots, x_n)$ or $(m; k; x)$
is used to denote that a set of numbers $(x_1, \cdots, x_n)$ runs over all different
compositions of $m$ into $n$ parts with each $x \ge k$, i.e. over all different integer solutions of
the equation $x_1 + \cdots + x_n = m$ with each $x \ge k$.

(B) Let $m$, $\delta$ be two positive real numbers. The notation $E(m, \delta, [f_1] \cdots [f_n])$
denotes the mathematical expectation of the product $f_1(x_1) \cdots f_n(x_n)$ in which
the sum $m = x_1 + \cdots + x_n$ is known and for every $x_\nu$ $(\nu = 1, \cdots, n)$ the value
of $x_\nu/\delta$ is a positive integer. The notation $E(m, \delta, [f_1] \cdots [f_n])$ thus implies that
the value of $m$ is a multiple of $\delta$. We call $\delta$ a "varying unit", i.e. the least
possible difference between two different quantities $x_i$ and $x_j$, $i \ne j$. The
notation $E(m, \delta, [f]^n)$ is merely a special case that denotes the mathematical
expectation of the product $f_1(x_1) \cdots f_n(x_n)$ under the known conditions

\[ f_1 = \cdots = f_n = f, \qquad x_1 + \cdots + x_n = m, \qquad
   \frac{x_\nu}{\delta} = \Big[\frac{x_\nu}{\delta}\Big] \ge 1 \quad (\nu = 1, \cdots, n), \]

where [ ] represents "integral part of".

(C) In order to simplify our formulas we always denote /(x) by/,!-!-••• 
+ /», by Li . ,, and l.pi + ■ • ■ + k.pi by aip) or tr. It is a convention that 

= 0 for m < n. 

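2. Lemmas. Lemma 1. Let $m, r_1, \cdots, r_n$ be non-negative integers. Then

\[ \sum_{(m;0;x)} \binom{x_1}{r_1} \cdots \binom{x_n}{r_n} = \binom{m + n - 1}{r_1 + \cdots + r_n + n - 1}. \tag{1} \]

Proof: The lemma follows immediately by considering the coefficient of the
term $x^m$ on both sides of

\[ \frac{x^{r_1}}{(1 - x)^{r_1 + 1}} \cdots \frac{x^{r_n}}{(1 - x)^{r_n + 1}}
   = \frac{x^{r_1 + \cdots + r_n}}{(1 - x)^{r_1 + \cdots + r_n + n}}. \]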
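
Lemma 1 can be checked by brute force for small arguments. The sketch below assumes the reading of the lemma in which the sum runs over all compositions of m into n non-negative integer parts; the helper names are hypothetical. It also confirms the count of compositions into parts of size at least 1, which is used later in the proof of Theorem 1.

```python
from math import comb, prod


def compositions(m, n, k=0):
    """Yield every tuple (x_1, ..., x_n) of integers with each x_i >= k
    and x_1 + ... + x_n = m."""
    if n == 1:
        if m >= k:
            yield (m,)
        return
    # Leave at least k for each of the remaining n - 1 parts.
    for first in range(k, m - k * (n - 1) + 1):
        for rest in compositions(m - first, n - 1, k):
            yield (first,) + rest


def lemma1_left(m, r):
    """Sum over compositions of the product of binomial coefficients."""
    return sum(prod(comb(x, ri) for x, ri in zip(xs, r))
               for xs in compositions(m, len(r)))


def lemma1_right(m, r):
    """Single binomial coefficient on the right side of (1)."""
    n = len(r)
    return comb(m + n - 1, sum(r) + n - 1)


left = lemma1_left(7, (1, 2, 0))
right = lemma1_right(7, (1, 2, 0))
parts_at_least_one = sum(1 for _ in compositions(7, 3, k=1))
print(left, right, parts_at_least_one)
```

Setting every r_i to 0 reduces the identity to the familiar count of compositions of m into n non-negative parts.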

Lemma 2, Let a, b, c, • he. any consUiids, and ki , h , ks , ' ■ ■ any poHiive 
integers. Then 

(n.o.aTAY, ) + Ph'i + yki + ■ • • H- 71. — 1/ a! /?! y! 

Pkoof; Expanding the left-hand side of ( 2 ) we see that the eoeflicicnt of the 
term a'^b^c' • • ■ is equal to 

«!i3!7! (mV.) W ■ ‘ Yi/ ‘ ■ U'a A / ’ ' ’ \ ' 

By Lemma 1 it befiomes 



„ .!LL. / 771. -|- w — 1 

a!/9l7! \aki -f fikt -j- yki -j- • • ■ +71 — 1 


Hence the lemma. 

Lemma 3. Let m, 7i(S m) he two positive integers. Then, for any given poly¬ 
nomial f{x) of the kth degree, we have 


(3) 


Z fM • • • f(Xn) 

("a;*) 


Tt! 


^ /th + n - iWj [//- 1)';^: 
(njO.p) \ O' **1- 71 1/ y„Q 


wheref^^ - /(*), « = a{p) = l.pi + • • • + fcp*.. 

Proof: Since/(a:) is a polynomial of the Alh degree, there exist (fc + 1 ) values 
/3jb, ■ • • , jSo such that 



= fix). 


By putting a: = 0,1, • ■ , fc, it is orderly determined that 

■ (i)^"~” + ... + (-1)' Q/® = (/ _ D'^', (. = 0,1, ■ • ■. k). 

The lemma is thus obtained by ( 2 ). 

For convenience we denote the summation 23 (»i; 1; a:) fiixi) • • ■ fnix„) by 

(m,l;*) 

[/i] • • - [/„]). Thus the formula (3) can be written as 

S(m,[/r)-nl E ■ 

(nlolp) + R' 1/ 1—0 p,,! 





IjEmma 4 Leifi(x), ' • hen given polynomials Then 

(4) 6'(m, (/J • [/J) = i E [/,. + ■ ■ +Un 

n\ (VI 

1 n 

^\herc (vi • ■ ■ vi.) runs over all diJferenL combinations out of (1 • ■ ■ n), k = I, 
■■■ ,n. 

Proof ■ The proof depends essentially on the formal logic theorem. Con¬ 
sidering a typical term 

--—j 5(m, [fyj"'- ■ • • [/.,]"')> 1 < t < n, ( 7 i + • + g, = n, 

we see that it is contained in the last (n — t i) summations of the nghthand 
side of (4), i.e in the summations (n vk) as k = t, i -i- 1, ■ ■ ■ , n. The num¬ 
ber of occurrences of the term in the right-hand side of (4) is therefore 

— A _ 0 if t>n 
V ) 1 if i = 71. 

The term vanishes generally except when gi = • • = gj = 1. Hence the right- 
hand side gives 

Sijn, [/i] • • [/„]) 


L (-ly 


3. Theorems with formulas. In the following statements of theorems and 
corollaries, the notation (xi • ■ x„) is always to denote a set of undetermined 
quantities, though the kind of the quantities of the set is stated. 

Theorem 1. Let $(x_1 \cdots x_n)$ be a set of natural numbers under a known condition
$x_1 + \cdots + x_n = m$. Then, for any given polynomial $f(x)$ of the kth degree, we have


(5) 


Fj(jn, 1 , [/]”) 


n' y-i /7n-|-n — iAA [(/ — 1)1^" 

/ra — l\ (n,o.j)) \o- -p 71 — 1/ i.=o p,l 

[n-lj 


Proof . Let m' = m ~h nr. By lemma 1 we then have 

^ {xi\ f^n\ ^ - A 

M\0j‘ ‘\ 0 / \ n-k /■ 


This is the number of compositions of $m'$ into $n$ parts with each part $\ge r$. In
particular, for $r = 1$ we see that the number of compositions of $m$ into $n$ parts is
$\binom{m-1}{n-1}$. Thus by the definition of mathematical expectation, the required
value is equal to

\[ \frac{S(m, [f]^n)}{S(m, [1]^n)}. \]




The theorem is therefore proved by Lemma 3. 





Coeollary 1. Let (zi - ■ • Xn) he a set of positive quantities, of which the vary¬ 
ing unit IS S, and the sum is m. Then, for any given polynomial /(a;) of the kth 
degree, we have 


fm 


(6) E{m, a. [/D 


where 


fm 

J 

\n 


: + n~l 

I 

— (rtiO***) \(7- -p n — 1/ 

1 / 


fr [iq “ lY'T 
fi " pA 


g{x) ~ fidx), cr = Ipi -h • • • + kpk . 

Peook: It 18 deduced by the relation E{m, d, [/(a:)]") = E{m/b, 1 , [/(as)"). 
CoROLLABT 2. Let (xi ‘ ■ x„) be a set of non-negative real numbers under a 
known condition xi + ■■• + ~ m. Then, for any given polynomial f{x) = 

Oo + • ■ • + «***! haxe 


(n!)’ 




m 


(Olofl)' 


n (o' + R ~ 1)1 ?ijl 


(/riatr 
'"gll ’ 


(7) E{m, 0, [/]") 
where 

Ok 0, s ~ s(q) ~ ?i -f ' • + • 

Proof: The proof of the corollary drpendfi cHKciitially on the coneopt that two 
different real numbers may differ by an arbitrarily small numljer h. 

Let h bo an arbitrary positive nuinlrer and let fixh) = lYgix, h), where the 
number k ie the degree of /(*). Then, since 

0 if p > H 

V / \p if p “ n 

L (-1) “ /ft 4- l\ 

V 2 p « n + 1, 

we may write 

t (“])' gi*' “«,/») - + h-Rm, 

• mQ W 

where lim R,(h) *2 vlOf+i. 


*-*0 


Now we pass to the limit h 0, in which it is assumed that h runs through a se¬ 
quence of rational numbers of the form l/JV. Thus by OoroUiuy 2 wo have 

lira B(m, h, [/n «nl(n--1)1 Z ^ ^ fl . 

^-.o i»\ 0 \p) l<r -t- n 1;! p,\ 

Hence the corollary. 

It may be noted that this corollary can also be independently deduced by the 
proportion of the two integrals: 

/ Is ^ dxi dxn-i: J j dxi da;„-i, 





where the integrals are all taken over the region R: Xi + ■ • + = m, Xi > 

0, > 0 

Corollary 3 Let {xi • • • Xn) he a set of positive real numbers under a known 
condition a < a:i + • • ■ + *n < where a, h are non-negative numbefs. Then, 
for any given polynomial fix) = a, + ai,x’‘ (ak 0), the mathematical ex¬ 
pectation of the product fix,) • ■ which we denote by EUab), 0, [/]"), is given 

by the formula 

Eia, b), 0, [/]") = 

b — a 

( 8 ) 

V ■ y ^0 ^ ^ ^ (fcla,)«* 

(«,o,9) (1 + eiq))-in - 1 -f £r(g))lffo! fffc! 

Proof: Since the required mathematical expectation is the mean 

Corollary 3 follows from Corollary 2. 

On the other hand we see that 

hm Eia, a + h), 0, [/]") = Eia, 0, [/]"). 

h-.0 

Hence Corollary 2 can also be deduced from Corollary 3. 

Theorem 2. (First generalization of Theorem 1). Let fiix), • • • fnix) be n 
given polynomials, of which the highest degree is k. Then we have 


Eim,l,[f,]--‘{fn]) = L Z (-1)"“' 

(vi •♦K,) (n;0,p> 
l^a^n 


(9) 


X 


( m + n — l\ 
<r + n - 1/ 

(:: 1 ) 


)i-=o Pn ! 


where 

Proof: In the proof of theorem 1 we have seen that 

Eim, 1, [/]") = 1) ' [/]"). 

Thus, by similar reasoning and lemma 4, we have 

Eim, 1, [/i] ■ ■ • [/„]) = ^ Z ^ /"m — l\ 


,(m - iV 





The theorem is proved by lemma 3 
CoROU^RYl. Lei S be a varifvig unit. Then 


f/i] ‘' • t/nl) 


( 10 ) 


where 


£ E (-1)" ' 

t'l '•'•1 (n.O.I’) 


X 



II 


M'-O 


[(!/., ... - I)''**]'”'' 

' P.! 


J7.(a:) = friSx), = ff,, + ■ • ■ + (7.. . 

ProoI': By the relation Eim, &, f/i (x)) • • • [/„(x)]) = E (ra/i, 1, [/i(5i)] . • • 
[/n(5a:)]) we obtain the corollary. 

Corollary 2. For any positive real number $m$, we have

\[ E(m, 0, [x^{p_1}] \cdots [x^{p_n}]) = \frac{p_1! \cdots p_n!\,(n - 1)!}{(p_1 + \cdots + p_n + n - 1)!}\; m^{p_1 + \cdots + p_n}. \tag{11} \]

Proof: Sincei, [fi] '••[/„]) = ^(“1)" Eim, &lfn ,,]"), we havei 
by lotting a “> 0, 

B(.m, 0, [/il ... fM) » E J ' /!-’ (»b 0. [/,. .„]"). 


The corollary is therefore deduced by (7). 
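
Formula (11) can be verified exactly in the simplest case n = 2. The sketch below assumes, in line with the integral remark made earlier in this section, that the expectation is taken with (x_1, x_2) uniformly distributed on the segment x_1 + x_2 = m, so that it equals (1/m) times the integral of x^{p_1}(m − x)^{p_2} over [0, m]. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def formula_11(m, powers):
    """Right side of (11) for exponents p_1, ..., p_n, in exact arithmetic."""
    n = len(powers)
    numerator = factorial(n - 1)
    for p in powers:
        numerator *= factorial(p)
    denominator = factorial(sum(powers) + n - 1)
    return Fraction(numerator, denominator) * Fraction(m) ** sum(powers)


def exact_mean_two_parts(m, p1, p2):
    """(1/m) * integral from 0 to m of x^p1 * (m - x)^p2 dx,
    obtained by expanding (m - x)^p2 with the binomial theorem."""
    m = Fraction(m)
    integral = sum(
        Fraction((-1) ** k * comb(p2, k)) * m ** (p2 - k) * m ** (p1 + k + 1) / (p1 + k + 1)
        for k in range(p2 + 1)
    )
    return integral / m


by_formula = formula_11(3, (2, 1))
by_integral = exact_mean_two_parts(3, 2, 1)
print(by_formula, by_integral)
```

Exact fractions are used so that agreement is an equality rather than a floating-point approximation.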

Theorem 3. (Second generalization of Theorem 1). Let $(x_1 \cdots x_n)$ be a set
of integers under known conditions $x_1 + \cdots + x_n = m$, $a \le x_\nu \le b$, where $m$, $a$, $b$
are given integers. Then, for any given polynomial $f(x)$, the mathematical
expectation of the product $f(x_1) \cdots f(x_n)$, denoted by $E_{(a,b)}(m, 1, [f]^n)$, is given by the formula


( 12 ) 


where 


E (m, 1, (/]") 

Co,6) 


E(-i)'M.s'« mr 0 

>•-0 \k / 

T'-»-(:)(«:/)' 


ff(x) « /(b 4- x), /i(x) = /(a + X - 1) and in' ^ m - (a - l)n -f- (a - b - l)r. 


Proof: Define S{m, [/]") = 0 for ?r < n, and Eim, iff) - ^ , We 

1 for m = 0 

shall now prove that 

E (-!)'(”)SK, [,mr') - •£ 

.-I) / (il'.-Tn) 





where on the right-hand side of the expression the set {xi, ■ ■ x„) under the 
summation runs over all different compositions of m into n parts and 


a < X, < h, V = 1, ■ ■ , n, 

For convenience we denote the left-hand side of the expression by that is, 
y-O \v / 

= E (-1)' E Sim, [fix -h b)r) Sim' - m, [fix -j- a - 1)]-'). 

('“"0 / m—r 


Let fixi) ■ • ■ fix„) be a product term contained in i.e., £i -f • • ■ + = m‘, 
Xi a, ■ • ■ , Xn ^ a. We assume that i,, > 5 + 1, • ■ , > 6 + 1, where 
vi 7^ Vjii i 7^ j. Then it is seen that the number of occurrences of the product 
term in @ is given by 


E(-i) 


a^O 



< > 1 

if i = 0. 


Thus the product term /(ii) • • /(in) of © vanishes except when 
d ^ X, ^ bf V = 1, n. 


Hence we have 


@ = E fixi) • • • fiXn). 

a^x^b 


Next, we shall find the number of different compositions of m into n parts with 
each a < Xy < b, i.e., the number of product terms of ©. By the above result 
we see that the number is given by 


n 


EE(-i) 



1 E 


m 

i = E(-i) 


»»o 



Hence the theorem. 

This theorem shows that the mathematical expectation E (m, 1, [/]") can be 

(ob) 

expressed by and is therefore expressible in terms of linear combinations 

of the coefficients of the polynomial fix). 


Corollary 1 . Let & he a varying unit for which^ o’"® integers. 

Odd 


Then 


E im,s,[fix)r)= E h iyimr). 

(ob) ((o/J).(!>/«)) \o / 

Corollary 2. Let fiix), • ■ ■ fnix) be n given ‘polynomials. Then 

E im, 1, m • • [/„]) = E E im, 1, [f ,,. 

(ob) (ri f,) n\ (o,li) 





Corollary 3. The number of integral solutions of the equation $x_1 + \cdots + x_n = m$
with $a_1 \le x_1 \le b_1, \ldots, a_n \le x_n \le b_n$ is equal to

$\sum_{r_1, \ldots, r_n = 0}^{1} (-1)^{r_1 + \cdots + r_n} \binom{m + n - (a_1 + \cdots + a_n) + (a_1 - b_1 - 1) r_1 + \cdots + (a_n - b_n - 1) r_n - 1}{n - 1}.$

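Proof: We have shown that the number of integral solutions of the equation
$x_1 + \cdots + x_n = m$ with $a \le x_\nu \le b$ is given by

$\sum_{r=0}^{n} (-1)^r \binom{n}{r} \binom{m - (a - 1)n + (a - b - 1)r - 1}{n - 1}.$

Hence the number of integral solutions of the equation $x_{11} + \cdots + x_{1 n_1} +
\cdots + x_{s1} + \cdots + x_{s n_s} = m$ with $a_\nu \le x_{\nu\sigma} \le b_\nu$ ($\nu = 1, \ldots, s$; $\sigma = 1, \ldots, n_\nu$)
is given by

$\sum_{r_1=0}^{n_1} \cdots \sum_{r_s=0}^{n_s} (-1)^{r_1 + \cdots + r_s} \binom{n_1}{r_1} \cdots \binom{n_s}{r_s} \binom{m - (a_1 - 1)n_1 - \cdots - (a_s - 1)n_s + (a_1 - b_1 - 1)r_1 + \cdots + (a_s - b_s - 1)r_s - 1}{n_1 + \cdots + n_s - 1}.$

The corollary follows at once by putting $n_1 = \cdots = n_s = 1$, $s = n$.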
This corollary can be restated in a more interesting manner as follows:

Let there be $n$ storerooms, and let $b_1, \ldots, b_n$ be the numbers of stocks contained
in the 1st, 2nd, ..., $n$-th storerooms respectively. Then $m$ stocks containing
at least $a_i$ stocks of the $i$-th storeroom ($i = 1, \ldots, n$) can be chosen from
these $n$ storerooms in

$\sum_{r_1, \ldots, r_n = 0}^{1} (-1)^{r_1 + \cdots + r_n} \binom{m + n + (a_1 - b_1 - 1) r_1 + \cdots + (a_n - b_n - 1) r_n - (a_1 + \cdots + a_n) - 1}{n - 1}$

different ways.
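The counting formula of Corollary 3 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and binomial coefficients whose upper index falls below the lower index are taken to be zero, which is the usual convention for this kind of inclusion-exclusion sum.

```python
from itertools import product
from math import comb


def count_bounded_solutions(m, lower, upper):
    """Count integer solutions of x_1 + ... + x_n = m with lower[i] <= x_i <= upper[i],
    using the alternating sum over r_1, ..., r_n in {0, 1} from Corollary 3."""
    n = len(lower)
    total = 0
    for r in product((0, 1), repeat=n):
        top = m + n - sum(lower) - 1
        top += sum((a - b - 1) * ri for a, b, ri in zip(lower, upper, r))
        # Assumed convention: the binomial coefficient is zero when top < n - 1.
        if top >= n - 1:
            total += (-1) ** sum(r) * comb(top, n - 1)
    return total


def count_by_enumeration(m, lower, upper):
    """Brute-force count used only as an independent check."""
    ranges = [range(a, b + 1) for a, b in zip(lower, upper)]
    return sum(1 for xs in product(*ranges) if sum(xs) == m)


if __name__ == "__main__":
    # Three dice showing a total of 10: each x_i between 1 and 6.
    print(count_bounded_solutions(10, [1, 1, 1], [6, 6, 6]))
    print(count_by_enumeration(10, [1, 1, 1], [6, 6, 6]))
```

In the storeroom reading, `lower` holds the required minimum numbers and `upper` the contents of the storerooms, with stocks from the same storeroom treated as indistinguishable.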

So far we have established several combinatorial formulas concerning the
mathematical expectation of the product $f_1(x_1) \cdots f_n(x_n)$ under certain conditions.
In the next section, we shall explain how to apply these formulas.


4. Applications. (a) A criterion. In order to make the above formulas
applicable to practical problems we state a criterion as follows: The mathematical
expectation of a function $F(x_1, \ldots, x_n)$ can be estimated by the above
combinatorial formulas if and only if the sum of these undetermined quantities
$x_1, \ldots, x_n$ is known and there exist $n$ polynomials $f_1(x), \ldots, f_n(x)$ such that
$F \propto f_1, \ldots, F \propto f_n$, where the quantities $x_1, \ldots, x_n$ may or may not be
continuous. When the quantities are discontinuous, the varying unit is certainly
given.

(b) Some approximations. For $f(x) = \beta_0 + \beta_1 x + \cdots + \beta_k x^k$ ($\beta_k \ne 0$) we may write

(/- 

«—0 

where ^1,., is a Stirling number of the second kind, as used by Jordan, and de¬ 
fined by 

Thus, the formulas (5) and (9) can be written as follows: 


B{m, 1, [/]") 
(50 


(m -f 71. — 1)1 (m — n)l n! (n — 1) ! 

(n;o;p) (m — <r)!(<r -t n — l)l(m — 1)1 

A (/3, S,,, + • ■ • + /Si Bf.t)"' 
■ H p.! 


(90 


i^(m, 1, [/j • • • f/j) = z z (-ir* 

(fl- ►,) (n,o;p) 

l^i^n 

(m + n — 1)! (ct — n) I Ttl (ti — 1)! (B, 3,,, + • ■ • -j- Bk Sit)”' 

(m — tr) 1 (o’ -|- 71 — 1)! (tr ~ 1) 1 >’-i> pd • 


where 


3,4 = ylS ,,, /< = |3,o + • • • + Bi = fti -b • ■ ■ + Pm ■ 

Now we state some convenient formulas concerning the numbers $S_{m,n}$.

If $m$ is sufficiently large and $t$ is smaller than $m$, the following recurrence relation
is useful:


+ lit + l)Xo + 2Xi] 

"b ■ ■ ■ + [(2t — 1)X(_2 + tX(_J ^ ^ , 


where $\lambda_0 = 1$, $\lambda_{t-1} = 0$ and $\lambda_1, \ldots, \lambda_{t-2}$ are all independent of $m$.





Starting from the first equality and using the recurrence relation $S_{m,n+1} =
m S_{m,n} + S_{m-1,n}$ successively we have

m 

= J]) (wt “■ P ' 4 * l)jSm'-r 4 ‘t ,pn f I->■ 

■ § [§ (:+/;:)«+>+d+ g ("^ ri j ’) +» 

” § ^((i+/+ 2) (‘+'+■>+(,+/+1) (j+»] 

” ^ [(f + + (14- 1 ) ’ 

where $\lambda_{-1} = \lambda_{t-1} = 0$. The recurrence relation is thus deduced.

Writing 


S, 




(:+‘)+KT; 2 )+-+^+t‘). 


and using the recurrence relation as obtained above, the coefficients $\lambda_k$
may be exhibited as follows:

 t   λ_0   λ_1     λ_2       λ_3        λ_4        λ_5        λ_6        λ_7        λ_8
 1    1
 2    1     3
 3    1    10      15
 4    1    25     105       105
 5    1    56     490      1260        945
 6    1   119    1918      9450      17325      10395
 7    1   246    6825     56980     190575     270270     135135
 8    1   501   22935    302995    1636635    4099095    4729725    2027025
 9    1  1012   74316   1487200   12122110   47507460   94594500   91891800   34459425
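The tabulated coefficients can be regenerated and checked against Stirling numbers of the second kind. The sketch below is an illustration under stated assumptions, not the author's own procedure: it assumes the expansion $S_{m,m+t} = \sum_{k=0}^{t-1} \lambda_k(t) \binom{m+t}{t+k+1}$ and the coefficient recurrence $\lambda_k(t) = (k+1)\lambda_k(t-1) + (t+k)\lambda_{k-1}(t-1)$, which agree with the two special cases quoted in the text; the function names are hypothetical.

```python
from math import comb


def stirling2(n_elements, n_parts):
    """Number of partitions of n_elements objects into n_parts nonempty classes,
    computed with the standard recurrence S(k, j) = j*S(k-1, j) + S(k-1, j-1)."""
    table = [[0] * (n_parts + 1) for _ in range(n_elements + 1)]
    table[0][0] = 1
    for k in range(1, n_elements + 1):
        for j in range(1, min(k, n_parts) + 1):
            table[k][j] = j * table[k - 1][j] + table[k - 1][j - 1]
    return table[n_elements][n_parts]


def lambda_row(t):
    """Coefficients lambda_0(t), ..., lambda_{t-1}(t) from the assumed recurrence."""
    row = [1]  # t = 1
    for step in range(2, t + 1):
        prev = row + [0]  # lambda_{step-1}(step-1) is taken as 0
        row = [
            (k + 1) * prev[k] + (step + k) * (prev[k - 1] if k > 0 else 0)
            for k in range(step)
        ]
    return row


def stirling_by_expansion(m, t):
    """Assumed expansion of S_{m, m+t} in binomial coefficients of m + t."""
    return sum(lam * comb(m + t, t + k + 1) for k, lam in enumerate(lambda_row(t)))


if __name__ == "__main__":
    for t in range(1, 6):
        print(t, lambda_row(t))
    print(stirling2(7, 4), stirling_by_expansion(4, 3))
```

Here `stirling2(m + t, m)` corresponds to the paper's $S_{m,m+t}$, the first index being the number of classes.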


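Now let

$S_{n,n+t} = \binom{n+t}{t+1} + \lambda_1(t) \binom{n+t}{t+2} + \cdots + \lambda_{t-1}(t) \binom{n+t}{2t}.$

The recurrence relation obtained above gives

$\lambda_{t-1}(t) = (2t - 1)\lambda_{t-2}(t - 1),$
$\lambda_{t-2}(t) = 2(t - 1)\lambda_{t-3}(t - 1) + (t - 1)\lambda_{t-2}(t - 1).$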


X(-i (0 


(2(1’ 

<I2' ' 


(-1 


X,-2(0 = (f - 1)1 2 : 2'-=''"’ 



Thus we obtain 



Let 




m = 

^i2^Ax 


^ "I" ... / + i \ 

^ + V ’ ’ ” ’ \ 2 < - ly all less than 2t as n 

and since 



fn + t\ 


21 


n -t + 1 

- 4'g(0 

tt — < 4" 1 

_ 4‘iy(«) 


("2t') ® 






+ 


Vo(0 

n 

We may write (by Stirling’s formula) 

where $\epsilon_n \to 0$ as $n \to \infty$.

Now it is easily proved that the inequality

holds for every positive integer x. We have, therefore, 

''® ^ S V; I 4 /I* -3-;y; ((' - 1 ); 



a:- 1 

TT 





and 

/2A“' t 

Using these inequalities we have 

1, = j Vi ft + 2)' < 4' Vi) < j Vp j ft' - 1) = ti,, 

where it may be noted that


Hence we have in conclusion


lim “ ** 1. 

Mb) 1/ 


(14) 




fc(0 4~ ^ + fn \ 
■■ ■ n / 


where





1 ). 


Evidently the formula (14) implies (15) and (16):



t =*» 0(n‘”*), « > 0. 

(Stirling’s formula). 


REFERENCES 

[1] Paul S. Dwyer, "Combined expansions," Annals of Math. Stat., Vol. 9 (1938), pp. 97-132.

[2] John Riordan, "Moment recurrence relations for distributions," Annals of Math.
Stat., Vol. 8 (1937), pp. 103-111.

[3] P. A. MacMahon, Combinatorial Analysis, Vol. 1, Cambridge University Press, 1916.

[4] M. Josephine Roe, Interfunctional expressibility problems of symmetric functions, 1931
(Privately printed).



ON THE CONSTITUENT ITEMS OF THE REDUCTION AND THE 
REMAINDER IN THE METHOD OF LEAST SQUARES 

By S. Vajda. 

London 

1. Consider a set of variates $y_i$ ($i = 1, 2, \ldots, n$), which are normally and
independently distributed with variance 1. Let also a matrix $(x_{ik})$ with $i =
1, 2, \ldots, n$; $k = 1, 2, \ldots, s$ and rank $s$ be given. Find $b_1, \ldots, b_s$ in terms of $y$,
so that

$\sum_i \left( y_i - \sum_k x_{ik} b_k \right)^2$

is a minimum. This minimum value shall be denoted by $V_{\min}$.

It is known (see e.g. R. A. Fisher, "Applications of Student's distribution",
Metron Vol. 5, Part 3 (1925)) that $V_{\min}$ varies as does $\chi^2$ with $n - s$ degrees of
freedom and that it is possible to express $V_{\min}$ as the sum of $n - s$ squares of
linear functions of the $y_i$. In the following lines $\sum_i y_i^2$ will be expressed as the
sum of $n$ squares of such functions which are independent and of variance 1.
The sum of the first $s$ squares will equal $\sum_i y_i^2 - V_{\min}$ and therefore the remaining
$n - s$ squares equal $V_{\min}$.

Thus a simple way will be found of writing down explicitly the linear functions,
whose existence only was proved by Professor Fisher in Metron.
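2. We first calculate $V_{\min}$.

$\frac{\partial V}{\partial b_l} = 0$, for $l = 1, 2, \ldots, s$, gives the normal equations

(1) $\sum_{i=1}^{n} x_{il} y_i = \sum_{i=1}^{n} \sum_{k=1}^{s} x_{il} x_{ik} b_k,$

which can be written

(2) $\sum_{i=1}^{n} x_{il} y_i = \sum_{k=1}^{s} X_{lk} b_k$

with

$X_{lk} = \sum_{i=1}^{n} x_{il} x_{ik}.$

It follows from (1) that

(A) $V_{\min} = \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} \sum_{l=1}^{s} \sum_{k=1}^{s} x_{il} x_{ik} b_l b_k = \sum_{i=1}^{n} y_i^2 - \sum_{l=1}^{s} \sum_{k=1}^{s} X_{lk} b_l b_k,$

where the $b$ are solutions of (1).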
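As a numerical illustration of the normal equations and of expression (A), the following sketch solves a small case with $s = 2$ in exact rational arithmetic and compares the directly computed minimum with the value $\sum y_i^2 - \sum\sum X_{lk} b_l b_k$. The data and the function name are hypothetical, and Cramer's rule is used only because the system is 2 by 2.

```python
from fractions import Fraction


def least_squares_minimum(x, y):
    """x: list of n rows [x_i1, x_i2]; y: list of n values.
    Returns (b, direct_minimum, minimum_via_A) for the case s = 2."""
    n = len(y)
    # X_lk = sum_i x_il * x_ik, the matrix of the normal equations.
    X = [[sum(Fraction(x[i][l]) * x[i][k] for i in range(n)) for k in range(2)]
         for l in range(2)]
    rhs = [sum(Fraction(x[i][l]) * y[i] for i in range(n)) for l in range(2)]
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    # Cramer's rule for the two normal equations.
    b = [(rhs[0] * X[1][1] - X[0][1] * rhs[1]) / det,
         (X[0][0] * rhs[1] - X[1][0] * rhs[0]) / det]
    direct = sum((y[i] - sum(x[i][k] * b[k] for k in range(2))) ** 2 for i in range(n))
    via_a = sum(Fraction(v) ** 2 for v in y) - sum(
        X[l][k] * b[l] * b[k] for l in range(2) for k in range(2))
    return b, direct, via_a


if __name__ == "__main__":
    x = [[1, 0], [1, 1], [1, 2], [1, 3]]
    y = [1, 3, 2, 5]
    print(least_squares_minimum(x, y))
```

Both returned minima agree, which is what the passage from (1) to (A) asserts.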




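3. A second expression for $V_{\min}$ can be found as follows:
Introducing

$c_i = \sum_{k=1}^{s} x_{ik} b_k$

we obtain from (1)

(3) $\sum_{i=1}^{n} x_{il} c_i = \sum_{i=1}^{n} x_{il} y_i, \quad (l = 1, 2, \ldots, s).$

Now if $z_{iu}$ ($u = s + 1, \ldots, n$) are any $n - s$ independent solutions of

$\sum_{i=1}^{n} z_{iu} x_{il} = 0, \quad (l = 1, 2, \ldots, s),$

then the $c_i$ satisfy also

(4) $\sum_{i=1}^{n} z_{iu} c_i = 0, \quad (u = s + 1, \ldots, n).$

Let such a set of $z_{iu}$ be chosen. Then (3) will be solved by

(5) $c_i = y_i - \sum_{u=s+1}^{n} z_{iu} \lambda_u$

with $\lambda_u$ as indefinite factors and these $c_i$ satisfy (4), if

$\sum_{i=1}^{n} z_{iu} y_i = \sum_{i=1}^{n} \sum_{v=s+1}^{n} z_{iu} z_{iv} \lambda_v, \quad (u = s + 1, \ldots, n),$ or

(6) $\sum_{i=1}^{n} z_{iu} y_i = \sum_{v=s+1}^{n} Z_{uv} \lambda_v$

with

$Z_{uv} = \sum_{i=1}^{n} z_{iu} z_{iv}.$

Because of (2) the equation (A) can be transformed into

$V_{\min} = \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} \sum_{l=1}^{s} x_{il} y_i b_l = \sum_{i=1}^{n} y_i^2 - \sum_{i=1}^{n} y_i c_i = \sum_{i=1}^{n} \sum_{u=s+1}^{n} y_i z_{iu} \lambda_u,$

which is, because of (6),

(B) $V_{\min} = \sum_{u=s+1}^{n} \sum_{v=s+1}^{n} Z_{uv} \lambda_u \lambda_v,$

where the $\lambda$ are solutions of (6).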

The comparison of (A) and (B) gives

$\sum_{i=1}^{n} y_i^2 = \sum_{l=1}^{s} \sum_{k=1}^{s} X_{lk} b_l b_k + \sum_{u=s+1}^{n} \sum_{v=s+1}^{n} Z_{uv} \lambda_u \lambda_v,$

where the first form on the r.h.s. shows the reduction of $\sum_{i=1}^{n} y_i^2$ by the method of
least squares and the second form constitutes the remainder.

4. These two forms must now be expressed in terms of the $y$.

We introduce the notations 


= Xn , = 


Zll Z)2 

Zai Z 22 


Z“’ = 


Zu • Zi3 

Zji ■ * Xgg 


and 


rj / 7 (a+ 2 ) 

Z/ — , Zf = 

I Za+ig+lZg+iB+i 

It is well known (and can easily be verified) that 


.'«+] sfl.^s+li+2 


etc 


S £ Xikhihk — yO) (5^11^1 + • • • + Xuh,) 

1-1 Ai-l .A 


+ Z«) 


' z® ( 


ZuZl2 

Z 21 Z 22 


+ 


X' 

Z 2 , 


iiXiJ Y 

2lZ26r7 


+ • • • + 0-1) J^(») 


which may be written 


1 /■ V 1 

'^) ( S Zubfcj + 


Z 2 ] ^^Xikbk 

k-l 






ZuZi2 ■ ■ • Z Zu&i 


Xg-iXsl '^Xskik 


Using (2), this can be expressed in terms of the $y_i$ instead of the $b_k$, as follows:

I n I 

Zll 


J_ Y m_L 

v(l) V aj.il/i I + x 

.A \f»l / A -Z 


( 2 ) 


Z 21 Zt ,2 y. 


( 7 ) 


+ • + 


1 




ZuZi2 • 

• 2a:.iy. 

t=l 

Z,:Z.2 

n 

■ Z 2/. 

t=l 





Similarly by (6) the second form can be transformed into



The rank of $(x_{ik})$ is $s$, so that the order of the suffices can always be chosen
so as to make the above denominators different from zero.

Thus both the reduction and the remainder have been expressed by sums of
squares, whose numbers correspond to the "degrees of freedom" $s$ and $n - s$
respectively.

5. It remains to be shown that the linear functions of the $y_i$ appearing in each
form are mutually orthogonal and that in every one of them the sums of the
squares of the coefficients are unity.

Now if we call the $n$ linear forms which occur above $\sum_{j=1}^{n} a_{ij} y_j$ ($i = 1, 2, \ldots, n$),
then our proof implies that

$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} \left( \sum_{j=1}^{n} a_{ij} y_j \right)^2.$

This is an identity for any $y_i$, hence we must have

$\sum_{\nu=1}^{n} a_{\nu j} a_{\nu k} = 1$ if $j = k$, and $= 0$ if $j \ne k$.

We have thus shown that the matrix $(a_{ij})$ is orthogonal and it follows that

$\sum_{\nu=1}^{n} a_{j\nu} a_{k\nu} = 1$ if $j = k$, and $= 0$ if $j \ne k$.

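6. In practical applications the $x_{ik}$ will be given and if the expression (7) or
(8) is to be written down we must first solve the set of equations

$\sum_{i=1}^{n} z_{iu} x_{il} = 0, \quad (l = 1, 2, \ldots, s).$

We may assume that

$\begin{vmatrix} x_{11} & \cdots & x_{s1} \\ \vdots & & \vdots \\ x_{1s} & \cdots & x_{ss} \end{vmatrix} \ne 0.$

There exist, of course, an infinity of solutions. A very simple one can be found
if the matrix $(x_{ik})$ is completed into a square matrix by adding 1 in the diagonal
places and 0 elsewhere. We obtain

$\begin{pmatrix}
x_{11} & \cdots & x_{s1} & x_{s+1,1} & \cdots & x_{n1} \\
\vdots & & \vdots & \vdots & & \vdots \\
x_{1s} & \cdots & x_{ss} & x_{s+1,s} & \cdots & x_{ns} \\
0 & \cdots & 0 & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & 0 & \cdots & 1
\end{pmatrix}$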


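The minors of the terms of any of the $(s + 1)$th, ..., $n$th lines give one of $n - s$
independent sets of solutions for the $z_{iu}$.

If, e.g., $s = 1$, then the $z_{iu}$ are

$\begin{matrix}
-x_{21} & x_{11} & 0 & 0 & \cdots \\
-x_{31} & 0 & x_{11} & 0 & \cdots \\
-x_{41} & 0 & 0 & x_{11} & \cdots
\end{matrix}$ etc.

and the $Z$ are

$\begin{matrix}
x_{11}^2 + x_{21}^2, & x_{21} x_{31}, & x_{21} x_{41} \\
x_{21} x_{31}, & x_{11}^2 + x_{31}^2, & x_{31} x_{41} \\
x_{21} x_{41}, & x_{31} x_{41}, & x_{11}^2 + x_{41}^2
\end{matrix}$ etc.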


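Hence, for $s = 1$, $n = 2$,

$\sum_{i=1}^{2} y_i^2 = \frac{(x_{11} y_1 + x_{21} y_2)^2}{x_{11}^2 + x_{21}^2} + \frac{(-x_{21} y_1 + x_{11} y_2)^2}{x_{11}^2 + x_{21}^2},$

and for $s = 1$, $n = 3$

$\sum_{i=1}^{3} y_i^2 = \frac{(x_{11} y_1 + x_{21} y_2 + x_{31} y_3)^2}{x_{11}^2 + x_{21}^2 + x_{31}^2} + \frac{(-x_{21} y_1 + x_{11} y_2)^2}{x_{11}^2 + x_{21}^2}
+ \frac{\begin{vmatrix} x_{11}^2 + x_{21}^2 & -x_{21} y_1 + x_{11} y_2 \\ x_{21} x_{31} & -x_{31} y_1 + x_{11} y_3 \end{vmatrix}^2}{(x_{11}^2 + x_{21}^2) \begin{vmatrix} x_{11}^2 + x_{21}^2 & x_{21} x_{31} \\ x_{21} x_{31} & x_{11}^2 + x_{31}^2 \end{vmatrix}}.$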






If, however, $s = 2$, $n = 3$, then easy calculations lead to


f -A 3 fJu i/i +-rsi !/2 4* ■1‘si j/a)^ 

fralfi = L>y^ ' .3 ' I 'Is ' I „i“' 

t«»i Xii *r T *^^31 


■Til + Th + -Th Tii^i + Xiij/i + Xjiya 

Til Tn + XSL T« -f* Tn Tm Tii tfi 4- XS5 1/J 4'_T31 1/j 

, , (Tn 4 ‘Tji 4 ’Tji XitXii + TsiTis + TjitJ 


fju + Tsi -f- tL 

/ jTsiTji I 


XiaTii + TjaTji + XijJai Xij -I- Xjj 4* Xjs 


jXaiXjp jXjiXiil ixiiXjt] 

i I ?/l “h I ! 1/5 + i ' J/3 

;X22X32! 1 X 35 Xu j IXisXjji 


jXjiXaif iXjiXuf XiiXji^' 

4-: 4- 


IxsjXmi iXjjXu! jxuXss 


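As a specialized case consider $s = 1$, and $x_{11} = x_{21} = \cdots = x_{n1} = 1$. Then
the $Z_{uv}$ are

$\begin{matrix}
2 & 1 & 1 & 1 & \cdots & 1 & 1 \\
1 & 2 & 1 & 1 & \cdots & 1 & 1 \\
\vdots & & & & & & \vdots \\
1 & 1 & 1 & 1 & \cdots & 2 & 1 \\
1 & 1 & 1 & 1 & \cdots & 1 & 2
\end{matrix}$

The sum of squares into which $\sum_{i=1}^{n} y_i^2$ can be transformed is then found to be¹

$\frac{1}{n} \left( \sum_{i=1}^{n} y_i \right)^2 + \frac{1}{1 \cdot 2} (-y_1 + y_2)^2 + \frac{1}{2 \cdot 3} (-y_1 - y_2 + 2y_3)^2
+ \frac{1}{3 \cdot 4} (-y_1 - y_2 - y_3 + 3y_4)^2 + \cdots.$

¹ This is the result contained in a paper by J. O. Irwin, "Independence of the constituent
items in the analysis of variance," Suppl. Roy. Stat. Soc. Jour., Vol. 1 (1934).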
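The specialized decomposition with all $x_{i1} = 1$ can be verified for any sample. The following minimal sketch, with a hypothetical function name, lists the $n$ squares (the term $(\sum y_i)^2 / n$ followed by the successive contrasts) in exact arithmetic so that their total can be compared with $\sum y_i^2$.

```python
from fractions import Fraction


def constituent_squares(values):
    """Squares into which sum(y_i^2) splits when s = 1 and every x_i1 = 1."""
    y = [Fraction(v) for v in values]
    n = len(y)
    squares = [sum(y) ** 2 / n]  # reduction due to the mean
    for k in range(2, n + 1):
        # Contrast -y_1 - ... - y_{k-1} + (k-1) * y_k, divided by (k-1) * k.
        contrast = (k - 1) * y[k - 1] - sum(y[: k - 1])
        squares.append(contrast ** 2 / ((k - 1) * k))
    return squares


if __name__ == "__main__":
    sample = [3, 1, 4, 1, 5]
    parts = constituent_squares(sample)
    print(parts)
    print(sum(parts), sum(v * v for v in sample))
```

The first entry is the reduction with one degree of freedom and the remaining $n - 1$ entries make up the remainder.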




NOTES 


This section is devoted to brief research and expository articles, notes on 
methodology and other short items. 


ON THE ANALYSIS OF A CERTAIN SIX-BY-SIX FOUR-GROUP
LATTICE DESIGN USING THE RECOVERY OF
INTER-BLOCK INFORMATION

By Boyd Harshbarger¹


Virginia Agricultural Experiment Station 

1. Introduction. A detailed description for a six-by-six four-group lattice
design is given in a recent article [1] by the author, and the analysis is developed
which uses only the intra-block information to correct the varieties for the block
effects. Here is developed the analysis that makes use of both the intra- and the
inter-block information.

Referring to Group X on page 307, [1], since block (1) contains varieties 1 to 6,
and block (2) contains varieties 7 to 12, the difference between the means of
these two blocks is also an estimate of the difference between the first six varieties
and the second six varieties. The information obtained from such inter-block
comparisons was ignored in the previous analysis. In attempting to use this
information, the chief difficulty is to decide how estimates derived from the
comparison of block totals shall be combined with the previous estimates.
Since each block consists of six plots, comparisons between block totals may be
expected to have a higher error variance than the within-block comparisons,
just as in split-plot designs the main block comparisons usually have a higher
error than the sub-plot comparisons. The problem is, therefore, to estimate
the relative error variances of the inter- and intra-block comparisons, and then
to combine the two types of estimates to the best advantage.

2. Calculations of the adjusted varietal totals. In addition to the equations
(7), [1], which contain all the intra-block information, we now have the additional
set of equations,

B, = + (sum varietal constants in this block) + e., which are estimated 

by 

R, = Got -f Ivbt +■ Er. 


In these equations and all the following equations, the double prime symbol
(") used in [1] is omitted, but the statistics have the same meaning as in equations
(7), [1] except in this paper they are adjusted by both inter- and intra-block
information.


¹ The author wishes to express his appreciation to W. G. Cochran of Iowa State College,
who advised in the preparation of this analysis.



The general problem is to minimize the function,


F - -- m 


hy + — (i?n - iili't.)’ 


3j3 w k 

subject to tlie reitrielion "■ where W = -"-and 

?•"! »"“I (T 


1 


H" 


i 
1 • 
<>■6 


Following the method given in (Ij the typical block equations for h^i ■ • ■ 1;^, is 
“ G 3IF'-1- IP ~ ““ (5 'iW -b TF' 


and for b,i ■ • • but is 

i.L 


B.i 


144 UP -b P')(3P + W 


[(2f)P* + 22 WW -b lF'“)C.i 


-b (IF - lF')*(r., -b (’rt)] + (Cur + Cut + 

It can be seen that for $W' = 0$, $b_{xi}$ and $b_{zi}$ are the intra-block values given in
[1] and for $W' = W$ they are the randomized block values.
A typical adjusted varietal total then becomes

ur _ w' 

4ei "b 4fft ==> t”, — ^ (Ji,, 4- f)„, -b I'll b but). 

3. Estimation of W and W'. Following the method presented by Cochran [6]
and Yates [3], the error of a block total may be written as

$E_i = e_{i1} + e_{i2} + \cdots + e_{i6} + 6b_i,$

where

$V(e) = \sigma^2$ and $V(b) = \sigma_b^2.$

Hence $V(E_i) = 6\sigma^2 + 36\sigma_b^2$ and component (a) is thus an estimate of $\sigma^2 + 6\sigma_b^2$.
One finds from evaluating the expected value of (15), [1] corrected for replicates, 

, that the expected value of component (b) is o-* -f f-Ga*. 


- - 


SbSC \ 


6 / 


In the analysis of variance if components (a) and (b) are pooled, one obtains the
block variance $B$ as an estimate of $\sigma^2 + \tfrac{21}{4}\sigma_b^2$. Since the intra-block variance $E$
is an estimate of $\sigma^2$, the estimate of the true variance between blocks, $\sigma^2 + 6\sigma_b^2$, is

$\frac{8B - E}{7} = \frac{1}{W'}.$
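The two weights can be computed from the mean squares of the analysis of variance along the following lines. This is a sketch under stated assumptions rather than the author's worked procedure: it takes $W = 1/E$ with $E$ the intra-block mean square, takes the between-block variance estimate as $(8B - E)/7$ with $B$ the pooled block mean square, and adds a guard (an assumption, in line with common practice) that sets $W' = W$ when $B$ does not exceed $E$. The function name is hypothetical.

```python
def inter_and_intra_weights(intra_block_ms, pooled_block_ms):
    """Return (W, W') from the intra-block mean square E and the pooled block mean square B.

    Assumes 1/W = E and 1/W' = (8B - E)/7; if B <= E the blocks show no
    extra variation and W' is set equal to W (assumed convention).
    """
    w = 1.0 / intra_block_ms
    if pooled_block_ms <= intra_block_ms:
        return w, w
    w_prime = 7.0 / (8.0 * pooled_block_ms - intra_block_ms)
    return w, w_prime


if __name__ == "__main__":
    # Hypothetical mean squares, for illustration only.
    print(inter_and_intra_weights(1.0, 4.5))
    print(inter_and_intra_weights(2.0, 1.5))
```

With $W' = W$ the adjustment reduces to the randomized block analysis, and with $W'$ close to zero it approaches the intra-block analysis, in agreement with the remarks on the two limiting cases.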

4. Standard error of adjusted varietal means. The standard error of the 
difference between the adjusted means of two varieties which appear together in 
the same blocks in groups Z or U, is 


4fcP 


(fc - 2) -b 


8P 


3P -b W‘ 


-]■ 





obtained by the method outlined by Cochran. Similarly, for the case in which
the varieties are together in the same block in groups Z or U.

When an attempt is made to express the difference between these two adjusted
varieties which appear together in the same block in groups X or Y in terms of
the levels of the main effects and interactions, the interactions are no longer
unconfounded and the method employed above breaks down.

If one is willing to assume that the formula for the variance of the difference
between two adjusted varietal means for varieties which appear together in the
same block in the groups X or Y is of the form

$\frac{1}{24W} \left( A + \frac{BW}{3W + W'} \right),$

the constants may be determined by the values already known, [1]. This form can
be shown to be that for a quadruple lattice.

The formula

$\frac{1}{24W} \left( A + \frac{BW}{3W + W'} \right)$

must reduce to the value for intra-block analysis [1] when $W' = 0$, and when
$W = W'$ to the value for complete randomized blocks. When these conditions
are imposed, the formula becomes


1 ( . 80IF \ 

1441F \ 3F + W'j ■ 


This value is slightly larger than the value obtained when the adjusted varieties 
appear together in the same block in groups Z or V, as should be the case. This 
gives us a lower limit. One can arrive at the upper limit in the following manner: 
suppose the variance (intra)i obtained in the intra-block analysis for the difference 
between two varietal means such as vi and Vi is greater than that for Varietal 
means Va and (intra) 2 , then it follows that: 

(inter -|- intra)i ^ (inter -j- mtra )2 X 


Using this relation, the upper limit for two varieties together in the same block
in groups X or Y is

1 / 12IF \64 
24IF \ 3F + W') 63 ’ 

which gives a value slightly greater than the formula derived, as it should if it
is to be the upper limit. In a similar manner one gets the variance for the
difference between varietal means not appearing together in the same block.

5. Efficiency of the design relative to the randomized complete blocks. By the
method outlined by Cochran [6] the efficiency can be shown to be measured by
the ratio of


A-L-L 

W^ W' 

“7 - — to 4 (average error variance of the difference between two plots). 

h L 

It will be noted, by using the above formula, that the gain in efficiency for 
the numerical problem given in [1] is 1.003, which for our purpose here is zero. 





This, in general, will not be the case, for on most soils there is a block difference.
In this particular test the ground used had been previously filled in with well
mixed soil. The efficiency for the analysis given in [1] relative to the randomized
complete blocks was less than 1.00.

This paper and the previous one show what a long tedious procedure is necessary
to analyze the data, when the design does not follow the rules for the
construction of the lattice, triple lattice, etc. The complexity of these methods
stresses the importance, to those designing experiments, of not deviating from
the established design if the most information is to be secured from the data with
simple calculations.


REFERENCES

[1] Boyd Harshbarger, "On the analysis of a certain six-by-six four-group lattice design,"
Annals of Math. Stat., Vol. 15, No. 3 (1944), pp. 307-320.

[2] F. Yates, "A new method of arranging variety trials involving a large number of
varieties," Journal Agri. Science, Vol. 26 (1936), pp. 424-455.

[3] F. Yates, "The recovery of inter-block information in three dimensional lattice,"
Ann. Eugenics, Vol. 9 (1939), pp. 136-156.

[4] F. Yates, "The recovery of inter-block information in balanced incomplete block
designs," Ann. Eugenics, Vol. 10 (1940), pp. 317-325.

[5] F. Yates, "Lattice squares," Journal Agri. Science, Vol. 30 (1940), pp. 672-687.

[6] G. M. Cox, R. C. Eckhardt, and W. G. Cochran, "The analysis of lattice and triple
lattice experiments in corn varietal tests," Iowa Agri. Exp. Sta. Res. Bul., Vol.
281 (1940).


FURTHER REMARKS ON LINKAGE THEORY IN 
MENDELIAN HEREDITY 

By Hilda Geiringer
Wheaton College 

In the following an explicit formula for the distribution of genotypes in case of
three Mendelian characters will be given [formula (5)]. The complete discussion
of the case m = 3 suggests a supplement (as stated in the last paragraph of this
paper) to the general limit theorem dealing with m characters.

In an earlier paper¹ recurrence formulae have been derived which furnish the
distribution of genotypes in the nth generation if the distribution in the (n - 1)th
generation and the "linkage distribution" (l.d.) are known. It was also
shown how to "integrate" this system of difference equations so as to determine
the distribution in the nth generation directly from that in the 0th generation.
This last method, though straightforward, requires however in each particular
case quite a few operations.

In case m, the number of Mendelian characters, equals two, an explicit
formula for the problem in question had been known. Denote by $p(x_1, x_2)$,
($x_1, x_2 = 1, 2, \ldots, k$), the "distribution of transmitted genes" in the original, 0th,
generation, by $p^{(n)}(x_1, x_2)$ that in the nth generation and by $c$ the "crossover
probability" (c.p.). Then the simple formula holds

(1) $p^{(n)}(x_1, x_2) = (1 - c)^n p(x_1, x_2) + [1 - (1 - c)^n] p_1(x_1) p_2(x_2).$

This may also be written:

(1') $p^{(n)}(x_1, x_2) = p_1(x_1) p_2(x_2) + (1 - c)^n [p(x_1, x_2) - p_1(x_1) p_2(x_2)],$

where $p_i(x_i)$ are the marginal distributions derived from $p(x_1, x_2)$. (1') shows
that, in case of independence of the original distribution, $p(x_1, x_2) = p_1(x_1) p_2(x_2)$,
we have $p^{(n)}(x_1, x_2) = p(x_1, x_2)$ for every n. The same is true for arbitrary $p(x_1, x_2)$
if $c = 0$. Otherwise, if $c > 0$ the second term to the right in (1') tends towards
zero as $n \to \infty$ and the well known limit theorem results.

¹ Hilda Geiringer, Annals of Math. Stat., Vol. 15 (1944), pp. 25-57. The notation
in the present Note will be the same as in this paper.
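Formula (1) can be compared with a generation-by-generation computation. The sketch below assumes the usual one-step recurrence for two characters, $p^{(n)} = (1 - c)\,p^{(n-1)} + c\,p_1 p_2$, which is not written out in this Note; the function names and the numerical example are hypothetical. The dictionary describing the distribution is expected to list every pair $(x_1, x_2)$, including pairs with probability zero.

```python
from fractions import Fraction


def marginals(p):
    """Marginal distributions p_1 and p_2 of a two-character distribution."""
    p1, p2 = {}, {}
    for (x1, x2), prob in p.items():
        p1[x1] = p1.get(x1, 0) + prob
        p2[x2] = p2.get(x2, 0) + prob
    return p1, p2


def one_generation(p, c):
    """Assumed recurrence: p' = (1 - c) * p + c * p_1 * p_2."""
    p1, p2 = marginals(p)
    return {(x1, x2): (1 - c) * prob + c * p1[x1] * p2[x2]
            for (x1, x2), prob in p.items()}


def after_n_generations(p, c, n):
    """Formula (1): (1 - c)^n * p + [1 - (1 - c)^n] * p_1 * p_2."""
    p1, p2 = marginals(p)
    decay = (1 - c) ** n
    return {(x1, x2): decay * prob + (1 - decay) * p1[x1] * p2[x2]
            for (x1, x2), prob in p.items()}


if __name__ == "__main__":
    half = Fraction(1, 2)
    start = {(1, 1): half, (1, 2): Fraction(0), (2, 1): Fraction(0), (2, 2): half}
    c = Fraction(1, 4)
    current = start
    for _ in range(5):
        current = one_generation(current, c)
    print(current == after_n_generations(start, c, 5))
```

Because the marginal distributions are unchanged from one generation to the next, iterating the recurrence and applying (1) give the same result.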

In case m = 3, a remarkably elegant explicit formula exists³ which may be
deduced from the author's general theory. In this case the l.d. is completely
equivalent to the three c.p.'s $c_{12}$, $c_{23}$, $c_{13}$. The $c_{ij}$ are probabilities with sum $\le 2$,
and for which the triangular relation

(2) $c_{ij} + c_{jk} \ge c_{ik}$

holds. If $l(\epsilon_1, \epsilon_2, \epsilon_3)$ ($\epsilon_i = 0, 1$) denotes the eight values of the l.d. we have (see
quot. [1], p. 32) $l(000) = l(111)$, $l(100) = l(011)$, $l(010) = l(101)$, $l(001) = l(110)$,
hence three independent values only. We may introduce

(3) $2l(000) = v(000) = v_0$, $2l(100) = v(100) = v_1$, $2l(010) = v(010) = v_2$,
$2l(001) = v(001) = v_3$; $v_0 + v_1 + v_2 + v_3 = 1.$

It follows easily that

(4) $c_{ij} = v_i + v_j$, ($i \ne j$; $i, j = 1, 2, 3$).

The original distribution $p(x_1, x_2, x_3)$ has marginal distributions $p_{ij}(x_i, x_j)$,
$p_i(x_i)$. These values will be denoted briefly by $p_{123}$, $p_{12}$, $p_{23}$, $p_{13}$, $p_1$, $p_2$, $p_3$
respectively. Writing in an analogous way $p^{(n)}(x_1 x_2 x_3) = p_{123}^{(n)}$ the new formula is
the following:

(5) $p_{123}^{(n)} = p_1 p_2 p_3 + [(v_0 + v_1)^n - v_0^n](p_1 p_{23} - p_1 p_2 p_3) + [(v_0 + v_2)^n - v_0^n](p_2 p_{13} - p_1 p_2 p_3)
+ [(v_0 + v_3)^n - v_0^n](p_3 p_{12} - p_1 p_2 p_3) + v_0^n (p_{123} - p_1 p_2 p_3).$

This useful formula permits to compute readily $p_{123}^{(n)}$ for every n. In terms of the
$c_{ij}$, writing

(6) $d_{ij} = 1 - c_{ij}$, $v_0 = 1 - \tfrac{1}{2}(c_{12} + c_{23} + c_{13})$,

it reads

(5') $p_{123}^{(n)} = p_1 p_2 p_3 + (d_{23}^n - v_0^n)(p_1 p_{23} - p_1 p_2 p_3) + \cdots + v_0^n (p_{123} - p_1 p_2 p_3).$
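Formula (5) can be checked against a step-by-step computation. The sketch below assumes a one-generation recurrence consistent with the meaning of $v_0, \ldots, v_3$ in (3), namely $p' = v_0 p_{123} + v_1 p_1 p_{23} + v_2 p_2 p_{13} + v_3 p_3 p_{12}$; this recurrence, the function names and the numerical values are assumptions made for illustration and are not quoted from the paper.

```python
from fractions import Fraction
from itertools import product


def marginal(p, keep):
    """Marginal distribution of p over the coordinate positions listed in keep."""
    out = {}
    for x, prob in p.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0) + prob
    return out


def one_generation(p, v):
    """Assumed recurrence: v0*p123 + v1*p1*p23 + v2*p2*p13 + v3*p3*p12."""
    v0, v1, v2, v3 = v
    p1, p2, p3 = marginal(p, (0,)), marginal(p, (1,)), marginal(p, (2,))
    p23, p13, p12 = marginal(p, (1, 2)), marginal(p, (0, 2)), marginal(p, (0, 1))
    return {
        x: v0 * prob
        + v1 * p1[x[:1]] * p23[x[1:]]
        + v2 * p2[x[1:2]] * p13[(x[0], x[2])]
        + v3 * p3[x[2:]] * p12[x[:2]]
        for x, prob in p.items()
    }


def formula_5(p, v, n):
    """Explicit formula (5) for the distribution in the nth generation."""
    v0, v1, v2, v3 = v
    p1, p2, p3 = marginal(p, (0,)), marginal(p, (1,)), marginal(p, (2,))
    p23, p13, p12 = marginal(p, (1, 2)), marginal(p, (0, 2)), marginal(p, (0, 1))
    result = {}
    for x, p123 in p.items():
        a, b, c = p1[x[:1]], p2[x[1:2]], p3[x[2:]]
        indep = a * b * c
        result[x] = (
            indep
            + ((v0 + v1) ** n - v0 ** n) * (a * p23[x[1:]] - indep)
            + ((v0 + v2) ** n - v0 ** n) * (b * p13[(x[0], x[2])] - indep)
            + ((v0 + v3) ** n - v0 ** n) * (c * p12[x[:2]] - indep)
            + v0 ** n * (p123 - indep)
        )
    return result


if __name__ == "__main__":
    weights = [1, 2, 3, 4, 5, 6, 7, 8]
    start = {x: Fraction(w, 36) for x, w in zip(product((0, 1), repeat=3), weights)}
    v = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8))
    current = start
    for _ in range(4):
        current = one_generation(current, v)
    print(current == formula_5(start, v, 4))
```

Under this recurrence the two-character marginals follow (1') with $1 - c_{23} = v_0 + v_1$ and so on, which is why the iterated distribution and formula (5) coincide.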

² H. S. Jennings, Genetics, Vol. 2 (1917), pp. 97-154.

³ Professor Felix Bernstein called this author's attention to the biologically interesting
case m = 3.


In these formulae the role of independence of the original distribution is clearly
seen: If $p_{ij} = p_i p_j$ and $p_{123} = p_1 p_2 p_3$ then $p_{123}^{(n)} = p_{123}$ for every n and every l.d.
The same holds for every n and every $p_{123}$ if $v_0 = 1$, which implies that all $c_{ij}$ be
zero. If in (5') all $d_{ij} < 1$, hence all $c_{ij} > 0$, the limit theorem
$\lim_{n \to \infty} p_{123}^{(n)} = p_1 p_2 p_3$ results. $c_{ij} > 0$ means that complete linkage between any two genes is
excluded. If, on the other hand, e.g. $v_0 > 0$, $v_1 > 0$, $v_0 + v_1 = d_{23} = 1$, $c_{23} = 0$,
hence $v_0 < 1$, $v_2 = v_3 = 0$, we get $p_{123}^{(n)} \to p_1 p_{23}$. If $c_{12} = c_{13} = 0$ the triangular
relation (2) shows that $c_{23} = 0$ too, a case considered above.

It should be noticed that (5) is, of course, in agreement with the author's
equation (41) in quot. [1]. It only has to be observed, an obvious fact not
mentioned in my earlier paper, that in the former set up the sum of all the
$a^{(n)}$ for every fixed n equals one. Thus for m = 3:


(7) 

«iii 

+ a'.H + «5,u 

, Int 

+ “J.l* 

+ <tu7a ~ 

1, (for ei'cry 

and 







(8) 

_(■») 

^ c;, 

= (I'O + 

vir - i-o 

- dn 

- Uo. 



«5,U 

= (I'a + 

i-})" - Co" 

= di"} 

~ I'o, 



“3.1} 

(e# 4- 

r*)" ~ Co" 

*= (in 

- fj. 


The preceding complete discussion of the case m = 3 suggests a remark
concerning the general case of m characters. In my earlier paper the influence
on the main limit theorem of certain ways of degeneration of the l.d. had not been
explicitly considered. In the following we shall use the v-distribution which
is a little shorter to write than the l.d. $l(\epsilon_1, \epsilon_2, \ldots, \epsilon_m)$. The v-distribution
contains only $2^{m-1}$ values with sum one, defined in a way similar to (3). The main
limit theorem ([1], theorem 11, p. 42) states in our present notation that

(9) $\lim_{n \to \infty} p_{12 \cdots m}^{(n)} = p_1 p_2 \cdots p_m,$

if "complete linkage" between any group of genes is excluded. That implies
that not only $v_0 = v(0, 0, \ldots, 0) = 1$ must be excluded but even $v_{i_1 \cdots i_k}(0, \ldots, 0) =
1$, where this last probability denotes a marginal distribution of the v-distribution
of an order $\ge 2$. To assure this it is necessary and sufficient that no $v_{ij}(0, 0) = 1$,
or no $d_{ij} = 1$, or no $c_{ij} = 0$. Hence (9) holds if and only if no $c_{ij} = 0$.
If this condition is not satisfied the l.d. degenerates in various ways and the limit
theorem is to be modified accordingly. If, in particular, $v_0 = 1$, all $c_{ij} = 0$, and
$p_{12 \cdots m}^{(n)} = p_{12 \cdots m}$ for every n.

Between these two extreme cases ("no $c_{ij} = 0$", "all $c_{ij} = 0$") are the different
possibilities of $r < m$ groups of completely linked characters (see [1] p. 30, iv)).
Consider e.g. m = 7 and $v_{1234}(0000) = 1$, $v_{567}(000) = 1$ (this is realized if
$v(0000000) > 0$, $v(0000111) > 0$ with sum of these two numbers equal to one) then
$\lim p_{12 \cdots 7}^{(n)} = p_{1234} p_{567}$. Here the four characters 1, 2, 3, 4 act as one character and
$p_{1234}^{(n)} = p_{1234}$ for every n. Also $p_{567}^{(n)} = p_{567}$. Or if, for m = 6, $d_{12} = d_{34} = d_{56} = 1$
(realized if $v(000000) > 0$, $v(110000) > 0$, $v(001100) > 0$, $v(000011) > 0$, with
the sum of these four values equal to one) then $p_{12 \cdots 6}^{(n)} \to p_{12} p_{34} p_{56}$. If however
for m = 6 merely $d_{12} = d_{34} = 1$ (realized if, in a notation analogous to (3), $v_0$, $v_5$,
$v_6$, $v_{12}$, $v_{34}$, $v_{56}$, $v_{125}$, $v_{126}$ are the only non-zero values of the l.d.) then
$p_{12 \cdots 6}^{(n)} \to p_{12} p_{34} p_5 p_6$.

In general, with a proof which consists in a modification of the reasoning (p.
41) of my earlier paper, we may state the following complement to the main
limit theorem (9): If the l.d. is such that $r < m$ disjoint groups $G_1, \ldots, G_r$
of completely linked characters exist, i.e. such that within each group no crossover
takes place, each group containing as many of the m numbers as compatible with the
definition but not less than two, and all groups together containing $s \le m$ of the m
elements, then, as $n \to \infty$, $p_{12 \cdots m}^{(n)}$ converges towards the product of those marginal
distributions (of the original generation) which correspond to these groups multiplied
by the marginal distributions of order one of the remaining free elements which are not
contained in any such group. In a formula

(10) $\lim_{n \to \infty} p_{G_1, G_2, \ldots, G_r, \gamma_{s+1}, \ldots, \gamma_m}^{(n)} = p_{G_1} p_{G_2} \cdots p_{G_r} p_{\gamma_{s+1}} p_{\gamma_{s+2}} \cdots p_{\gamma_m}.$

We may also characterize these linked groups of maximum size by stating that
while within each group no crossover takes place there must be at least one c.p. $\ne 0$
among any two such groups and at least one among any group and any free
element. It may however be noted that if there is one c.p. > 0 among two
groups of complete linkage (or among a group and a free element) then all c.p.'s
among these two groups are different from zero. In fact, it follows by repeated
use of the triangular relation (2) that if one c.p. among two disjoint groups of
complete linkage is zero, all of them are zero. If, e.g., (1,2,3) and (5,6,8) are two
groups of complete linkage, i.e. $v_{123}(000) = 1$ and $v_{568}(000) = 1$ and if besides
$c_{15} = 0$, then $v_{123568}(000000) = 1$ and these six elements form a group of complete
linkage.

It may be noticed that the above statement of the generalized limit theorem
becomes simpler and more elegant by counting "free elements" as groups. It
might then run as follows: If $G_1, G_2, \ldots, G_t$ ($t \le m$) are the maximal groups of
completely linked characters, then, under the hypotheses of the earlier paper, the gene
distribution in successive generations approaches a limit in which the original
(marginal) probabilities within each group $G_i$ are preserved and genes and sets of genes
from different groups are independently distributed.


ON THE DEFINITION OF DISTANCE IN THE THEORY OF THE GENE

By Hilda Geiringer
Wheaton College 

In several letters to this author Dr. I. M. H. Etherington of the University of
Edinburgh has raised questions concerning the author's definition of distance
proposed in Section 10 of her paper on Mendelian heredity,¹ comparing it with
the definition implicit in Professor J. B. S. Haldane's earlier treatment.² The
main content of the author's paper consists of some general limit theorems and
the integration of a certain system of difference equations. The distance
definition is a by-product subject to discussion.

¹ Annals of Math. Stat., Vol. 15 (1944), pp. 25-57.

"Distance" $d_{ij}$ between two genes i and j is defined by the author as the
mathematical expectation of the number of crossovers in the interval (i, j) with
respect to the "linkage distribution" (l.d.). This basic concept is introduced
as follows (page 32): If S is the set of numbers 1, 2, ..., m (m being the number
of Mendelian characters), A any subset of S and A' = S - A, we denote by
l(A) the probability that an individual with "maternal" genes $x_1, \ldots, x_m$
and paternal genes $y_1, \ldots, y_m$ transmit the paternal genes belonging to A and the
maternal genes belonging to A'. These $2^m$ probabilities constitute the l.d.
From these definitions the equality (G. (53'))

(1) $d_{ij} = c_{i,i+1} + c_{i+1,i+2} + \cdots + c_{j-1,j}$ $(i < j)$

is derived, where $c_{i,i+1}$ is the probability of a "crossover" (c.p.) in (i, i + 1). This
distance has the required additivity (G. (54)):

(2) $d_{ij} + d_{jk} = d_{ik}$, $(i < j < k)$.

Etherington points out that the term "distance" has an established currency
in genetics being the basis on which chromosome maps are constructed, and
that there is a standard method of calculating it in accordance with which (1)
is an "approximation valid only when the adjacent c.p.'s are small." Moreover
"the biological uniqueness has been lost for the value of $d_{ij}$ now depends on the
particular set of intermediate genes which we happen to be considering. If any
of them are omitted from consideration then the inequality (G. (13))

(3) $c_{ij} + c_{jk} \ge c_{ik}$

shows that in general $d_{ij}$ is diminished while if new genes are taken into
consideration $d_{ij}$ may increase." "In order that $d_{ij}$ should not depend on a particular
choice of intermediate genes the word 'crossover' in the definition given would
have to be interpreted as 'chiasma' instead of 'odd number of chiasmata', and
then $d_{ij}$ cannot be evaluated in terms of the l.d. alone without further
assumptions regarding the interference of crossovers."

The point of view adopted in the author’s paper was to regard the l.d. as the
basis from which everything else has to be inferred. The number $m$ of Mendelian
characters is considered constant and the distance, being a mathematical
expectation with respect to the l.d., necessarily depends on it. In this conception
distance is not a geometric property which can be measured for any two genes
independently but rather a system of $m(m - 1)/2$ consistent numbers associated
to the $m$ genes. There is no choice regarding the intermediate genes to be taken
into consideration; all known genes are to be considered, i.e. one has to use the
available relevant information in order to determine the l.d., the c.p.’s and the

² Quotation [4a] in the author’s paper. References to these papers will be distinguished
by the initials H and G.



DEFINITION OF DISTANCE 




distances. If the information is incomplete the results will be provisional and 
subject to change; if it is satisfactory the same will be true for the distances. 
Thus it is nothing but natural that $d_{ij}$ is changed if some genes are omitted from
consideration, or if new genes are discovered. In this set up “crossover”— 
defined by means of the marginal distributions of second order of the l.d.—means 
a transition from the paternal to the maternal set or vice versa. (Expressed 
in terms of the chiasma-hypothesis this means “odd number of chiasmata 
between adjacent genes.”) Additional assumptions “regarding the interference 
of crossovers” are neither necessary nor admissible. All this is contained in the 
l.d. 

Haldane’s approach as translated by Etherington into the author’s notation
is as follows. “The genes are considered to be distributed continuously along a
chromosome. Thus this approach unlike G.’s is not based on the l.d. of a
finite set of genes. We must think of one suffix, $i$, as referring to a gene at a
fixed locus on the chromosome, the others to variable loci, so that the c.p.’s
are variable. For any three genes $i, j, k$ a quantity $p$ is defined by the equation

(4) $c_{ik} = c_{ij} + c_{jk} - p\,c_{ij}c_{jk}$   $(i < j < k)$.

Biological considerations show that $p$ is a number between 0 and 2 (small when
$c_{ij}$ and $c_{jk}$ are both small, increasing, on the whole, with $c_{ij} + c_{jk}$). The distance
$D_{ij}$ is defined by the statement

(5) $D_{ki}/c_{ki} \to 1$ as $k$ approaches $i$ ($c_{ki} \to 0$),

together with the additive property, and from this with (4) Haldane’s general
distance expression is derived:

(6) $D_{ij} = \int_0^{c_{ij}} \frac{dc_{ij}}{1 - p_0 c_{ij}}$.

Here $p_0 = p_0(c_{ij})$ denotes the limiting form of $p$ when $k$ approaches $j$, and
represents biologically a property of the chromosome segment $(i, j)$, a measure of
interference. Any suitable specification of this function $p_0(c_{ij})$ would constitute
a mathematical ‘model’ of the chromosome. If $p$ were constant we should
have $p_0 = p$ and

(7) $D_{ij} = -\frac{1}{p} \log (1 - p\,c_{ij})$.

Both Haldane and Geiringer considered the special cases $p = 2$ (no interference)
and $p = 0$ (complete interference) for which respectively

(7') $D_{ij} = -\frac{1}{2} \log (1 - 2c_{ij})$,

(7'') $D_{ij} = c_{ij} = d_{ij}$.

Since $p$ is always between 0 and 2 Haldane concludes that the true value of $D_{ij}$
is between (7') and (7''), and he gives reasons for saying that (7') is nearly correct
for genes ‘far apart,’ (7'') for genes ‘close together.’ ”
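
As a numerical illustration of the constant-$p$ case, the following Python sketch evaluates (4) and (7) and checks the additive property for a few values of $p$. The function names and the sample crossover probabilities 0.10 and 0.20 are assumptions chosen for illustration; they do not come from Haldane’s or Etherington’s text.

```python
import math

def compose_cp(c_ij, c_jk, p):
    """Equation (4): crossover probability for (i, k) from the two adjacent segments."""
    return c_ij + c_jk - p * c_ij * c_jk

def distance(c, p):
    """Equation (7) for constant p; p = 0 is read as the limiting case (7''), D = c."""
    if p == 0:
        return c
    return -math.log(1.0 - p * c) / p

def additivity_gap(c_ij, c_jk, p):
    """D_ij + D_jk - D_ik, which should vanish when c_ik obeys (4)."""
    c_ik = compose_cp(c_ij, c_jk, p)
    return distance(c_ij, p) + distance(c_jk, p) - distance(c_ik, p)

if __name__ == "__main__":
    # Illustrative (assumed) crossover probabilities for two adjacent segments.
    for p in (0.0, 1.0, 2.0):
        print(p, compose_cp(0.10, 0.20, p), additivity_gap(0.10, 0.20, p))
```

Under these assumptions the gap is zero up to rounding for each $p$, and for a fixed $c_{ij}$ the value of (7) increases with $p$ from (7'') to (7').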





If the author is right, this seems to be the standard definition accepted in
genetics as mentioned above by Etherington. A few, not exhaustive, comments
may be added. Writing in (6) $t$ for the variable of integration and $p_0 = p_0(t)$
it is seen that the expression

(6') $D_{ij} = \int_0^{c_{ij}} \frac{dt}{1 - t\,p_0(t)}$

contains the unknown function $p_0(t)$, which is unspecified except for the
statement that it is bounded between 0 and 2. It is immediately seen that with an
arbitrary $p_0(t)$ and without a restriction taking the place of (4) this distance (6')
will not be additive in the sense of (2). By imposing, after a choice of $p_0(t)$,
appropriate restrictions on the $c_{ij}$ additivity may be achieved. For instance in
the particular case $p_0(t) = p = $ const, (2) holds by virtue of (4). For such a set
of restrictions it has then to be proved that the corresponding “model” is
“consistent,” i.e. that the so restricted c.p.’s form a compatible set of marginal
distributions of second order of an $m$-variate distribution, the l.d.

These different points will be exemplified presently by studying the particular
case $p_0(t) = p$, where $p$ is a suitably chosen constant; the parameter $p$ is to be
fitted to the observations under consideration. It may be impossible to reproduce
a set of observations satisfactorily if one parameter only is available. In
fact, Haldane’s paper suggests that it is not only the particular case $p = $ const
he has in mind. It seems however that if $D_{ij}$ is given by (6) with a non-constant
$p_0(t)$, complicated and perhaps (biologically) not very meaningful conditions may
have to be introduced in order to assure additivity of the distances and
consistency of the respective model. This author was unable to work out examples
of more general and at the same time appropriate and fairly simple assumptions
for the unknown function $p_0(t)$.

If $p = $ const, then (7) under the restriction (4) furnishes an additive distance
definition because:

$-p[D_{ij} + D_{jk}] = \log (1 - p\,c_{ij}) + \log (1 - p\,c_{jk})$

$= \log (1 - p\,c_{ij} - p\,c_{jk} + p^2 c_{ij}c_{jk}) = \log (1 - p\,c_{ik}) = -p\,D_{ik}$,

because of (4). Let us now investigate whether there is a consistent system of
c.p.’s satisfying (4). Put, as in G. (48), $c_{i,i+1} = p_i$, combine (4) with G. (50) and
write $p = 2\epsilon$. It follows that (4) is satisfied with $0 \leq \epsilon \leq 1$, if:

(8) $p_{ij} = \epsilon\,p_i p_j$,  $p_{ijk} = \epsilon^2 p_i p_j p_k$, $\ldots$.

Here $p_{ij}$ is the probability of the simultaneous occurrence of the “events”
numbered $i$ and $j$, etc. For $\epsilon = 0$ we get “disjoint events” (see G. i) for the
discussion of consistency). Assume now $\epsilon > 0$. By some considerations,
analogous to those on p. 54 of G., the following necessary and sufficient condition of
consistency follows:

(9) $\prod_{i=1}^{m-1} (1 - \epsilon\,p_i) \geq 1 - \epsilon$   $(\epsilon > 0)$.





This restriction (not considered by Haldane or Etherington) is, of course,
relevant.³ If e.g. $m = 3$, $p_1 = p_2 = 4/5$, then $\epsilon$ must be $\geq 15/16$; or if $m = 4$,
$p_1 = p_2 = p_3 = 1/2$, $\epsilon \geq 3 - \sqrt{5}$ results. The restriction required by the “linear
theory” is


( 10 ) 



(i = 1, 2, • • • , m — 1). 
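
The consistency condition (9) and the two numerical examples given with it can be checked directly. The Python sketch below is only an illustration: the helper names `slack` and `is_consistent` and the rounding tolerance are assumptions, and the product is taken over the $m - 1$ adjacent crossover probabilities $p_i$.

```python
import math

def slack(eps, p):
    """Left side minus right side of (9): prod_i (1 - eps*p_i) - (1 - eps)."""
    prod = 1.0
    for p_i in p:
        prod *= 1.0 - eps * p_i
    return prod - (1.0 - eps)

def is_consistent(eps, p, tol=1e-12):
    """True when condition (9) holds, up to a small (assumed) rounding tolerance."""
    return slack(eps, p) >= -tol

if __name__ == "__main__":
    # m = 3, p_1 = p_2 = 4/5: the boundary value is eps = 15/16.
    print(is_consistent(15 / 16, [0.8, 0.8]), is_consistent(0.93, [0.8, 0.8]))
    # m = 4, p_1 = p_2 = p_3 = 1/2: the boundary value is eps = 3 - sqrt(5).
    print(is_consistent(3 - math.sqrt(5), [0.5] * 3), is_consistent(0.70, [0.5] * 3))
```

In both examples the slack vanishes at the stated bound and is negative just below it, which is what makes the restriction binding.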


Hence this model is consistent under certain restrictions. It is, in contrast
to Etherington’s contention, different from iii) G. p. 54. The corresponding
distance definition (7) is different from the author’s. The $D_{ij}$ thus defined are
additive, and $D_{ij}$ depends on $c_{ij}$ only and not on the intermediate genes. The
author’s definition of distances, $d_{ij}$, is general, additive and seems to the author
to be well adapted to the biological situation; since the definition of $d_{ij}$ is not
related to any particular model it is compatible with any model, which may
contain any desired (consistent) assumptions about “interference,” etc. For
example in G. iv) p. 55, an $n$-parametric model has been suggested which seems
fairly flexible.

It may however seem more acceptable to the biologist not to use a general 
distance definition but to define “distance” merely in relation to some sufficiently 
general “model" (such that the distance definition would vary with the model), 
instead of accepting an all-over definition as ventured in the author’s paper. 
The particular model (8) in connection with its related distance definition (7) 
might give an example of such an approach.⁴


³ As Etherington remarks, eq. (14') in the author’s original paper is not correct. One
can only state that (47) holds. The mistake is however without consequence since no
conclusions are drawn from (14'). The same mistake was pointed out by Professor Kai
Lai Chung.

⁴ Etherington writes: “I have been kindly allowed to read Professor Geiringer’s MS.
and feel that some comments are necessary.

The standard procedure for calculating the distance between two linked genes is as
follows. A selection of intermediate genes is taken and the adjacent crossover values
calculated, giving a provisional estimate of the distance as in Geiringer’s formula (1).
When further intermediate genes are added to the selection, it is found that the provisional
distance increases, but there is apparently a maximum value beyond which it cannot be
increased. This unknown maximum value is the distance, and the geneticist accepts (1)
as the distance when he is sure that he has observed a sufficient number of intermediate
genes to give a good enough approximation to the true distance. Thus Geiringer’s formula
(1) gives the geneticist’s true distance only on the understanding that it includes all genes
intermediate between $i$ and $j$; but generally speaking the great majority of these genes
may be unobservable in the sense that they have no observably distinct alleles by means
of which the c.p.’s could be calculated, though from time to time fresh genes may become
observable by mutation.

In some cases the above procedure fails because not enough intermediate genes can be
observed; then Haldane’s analysis is useful. It should be emphasized that his distance is
additive by definition. (For a geometrical analogy, think of the genes as points closely
distributed along a curve, chords representing c.p.’s. Haldane’s definition of the distance
is analogous to defining arc length of the curve as a limiting sum of chords.) In my
transcription of his treatment, I should perhaps have made it clearer that the derived formula
(6) gives only the distance $D_{ij}$ measured from the initially chosen and fixed gene $i$ to an
arbitrary gene $j$. Other distances $D_{jk}$ $(i < j < k)$ are deduced from it by the postulate
of additivity ($D_{jk} = D_{ik} - D_{ij}$). If the origin $i$ is changed, there will be a similar formula
(6), but it should not be assumed that the function $p_0$ is the same. In referring to certain
conditions necessary ‘to assure additivity,’ Geiringer evidently means conditions that the
function $p_0$ may be the same for all origins $i$. These conditions would be interpreted
biologically as asserting uniformity of interference along the chromosome. I agree that there
are further points to be cleared up in this connection.

If I might sum up the discussion, I would say that the geneticist’s conception of the
distance between genes is an actual property of the corresponding chromosome segment.
Geiringer’s definition represents the best possible general approach to this from the limited
data of the l.d. alone. Haldane’s definition fits the geneticist’s conception, and his
investigation is an attempt to get the best estimate of the distance by making approximate
assumptions as to what happens between the observed genes. It is based on the
unobservable crossover-distribution of a supposed infinite set of genes, but can be applied to
particular models of this infinite c.d. so as to derive results which involve only a finite and
observable c.d. Finally it should be mentioned that in the paper quoted, Haldane gave
also an alternative method for the case $p = 2$, leading to the same formula (7'), which is
really equivalent to defining the distance as the mathematical expectation of the number of
chiasmata (not crossovers in G.’s sense) in the interval $(i, j)$.”


A CRITERION OF CONVERGENCE FOR THE CLASSICAL ITERATIVE 

METHOD OF SOLVING LINEAR SIMULTANEOUS EQUATIONS 

By Clifford E. Berry

Consolidated Engineering Corporation, Pasadena, Calif.

The recent development of two devices¹,² for solving linear simultaneous
equations by means of the classical iterative method³ has stimulated the writer
to investigate convergence criteria for the method. There are in the literature⁴
necessary and sufficient criteria for convergence of symmetric systems, and
sufficiency criteria for general systems. So far as the writer knows, however, this
is the first development of a necessary and sufficient criterion for convergence
in the general case. The results obtained are applicable to any arbitrary square
non-singular matrix in which $a_{ii} \neq 0$.

Let the set of equations be represented by

(1) $AX = G$,

¹ Morgan, T. D., Crawford, F. W., “Time-saving computing instruments designed
for spectroscopic analysis”, The Oil and Gas Journal, August 26 (1944), pp. 100-105.

² Berry, C. E., Wilcox, D. E., Rock, S. M., Washburn, H. W., “A computer for solving
linear simultaneous equations”, to be published.

³ Hotelling, Harold, “Some new methods in matrix calculation”, The Annals of
Mathematical Statistics, Vol. XIV (1943), pp. 1-34.

⁴ Mises, R. von and Pollaczek-Geiringer, Hilda, “Zusammenfassende Berichte. Praktische
Verfahren der Gleichungsauflösung”, Zeitschrift für angewandte Math. und Mechanik,
Vol. 9 (1929), pp. 58-77 and 152-164.





in which $A$ is the square matrix of the coefficients, $X$ is the column matrix of the
unknowns, and $G$ is the column matrix of the constant terms. $|A|$ is the
determinant of $A$.

We define a matrix $A_1$ which contains the prediagonal and diagonal terms of $A$,
and a matrix $A_2$ which contains the postdiagonal terms of $A$. According to this
definition,

(2) $A_1 + A_2 = A$.


In the classical iterative method, arbitrary (or approximate) values of the $x$’s
are chosen, the first equation is solved for the first unknown, the second equation
for the second unknown, etc., using in each equation the most recent
approximations to the $x$’s. This process may be written

(3) $A_1 X^{(1)} + A_2 X^{(0)} = G$,

in which $X^{(0)}$ is the initial approximation matrix, and $X^{(1)}$ is the approximation
matrix existing at the end of the first iterative cycle. The superscripts indicate
the number of the approximation. The next cycle is described by

(4) $A_1 X^{(2)} + A_2 X^{(1)} = G$,

and the $m$th by

(5) $A_1 X^{(m)} + A_2 X^{(m-1)} = G$.

The method yields a solution, i.e., converges, if

$\lim_{m \to \infty} (X^{(m)} - X) = 0$.

Solving (5) explicitly for $X^{(m)}$,

(6) $X^{(m)} = A_1^{-1} G - A_1^{-1} A_2 X^{(m-1)}$.

Subtracting $X$ from each side,

(7) $X^{(m)} - X = A_1^{-1} G - A_1^{-1} A_2 X^{(m-1)} - X$,

and making use of (1) and (2),

(8) $X^{(m)} - X = -A_1^{-1} A_2 (X^{(m-1)} - X)$.

Since (8) applies for any value of $m$, we may write

(9) $X^{(m)} - X = (-A_1^{-1} A_2)^2 (X^{(m-2)} - X)$,

and continuing this process,

(10) $X^{(m)} - X = (-A_1^{-1} A_2)^m (X^{(0)} - X)$.

Now, $\lim (X^{(m)} - X) = 0$ if and only if

(11) $\lim_{m \to \infty} (-A_1^{-1} A_2)^m = 0$.


This is a general result, applicable to any arrangement of the terms of an
arbitrary square matrix $A$, subject only to the conditions that $|A| \neq 0$ and that
no diagonal term of $A$ is zero. In this latter exceptional case, the iterative
method itself obviously cannot be applied.
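
One cycle of the classical iterative method, as written in (5), can be sketched in a few lines of Python. The procedure is commonly known today as the Gauss-Seidel method. The 3 by 3 system, the function names and the fixed number of cycles below are assumptions chosen for illustration, not taken from the paper.

```python
def iterate_once(A, G, x):
    """One cycle of (5): solve equation i for unknown i, using the newest values."""
    n = len(A)
    x = list(x)
    for i in range(n):
        # Terms j < i already hold cycle-m values (A1 part); j > i hold cycle m-1 (A2 part).
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (G[i] - s) / A[i][i]
    return x

def solve(A, G, cycles=50):
    """Run a fixed number of cycles starting from the zero vector."""
    x = [0.0] * len(A)
    for _ in range(cycles):
        x = iterate_once(A, G, x)
    return x

# Assumed example: large diagonal terms, small off-diagonal terms.
A = [[4.0, 1.0, 1.0],
     [1.0, 5.0, 2.0],
     [1.0, 2.0, 6.0]]
X_true = [1.0, -2.0, 3.0]
G = [sum(a * x for a, x in zip(row, X_true)) for row in A]
X_approx = solve(A, G)

if __name__ == "__main__":
    print(X_approx)
```

Updating `x` in place is what distinguishes this scheme from one that uses only the previous cycle's values: the terms before the diagonal use the current cycle, the terms after it use the preceding one.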

The criterion (11) clearly shows that the order in which the elements of the
matrix $A$ are arranged is important. For instance, it is plain that an
arrangement in which the diagonal terms are large and the off-diagonal terms,
particularly the post-diagonal terms, are small will tend to favor convergence.

A somewhat relaxed condition, which is sufficient but not necessary, is
obtained through the use of an inequality used by Hotelling³, namely,

(12) $N(B^m) \leq [N(B)]^m$,

in which $N(B)$ is the norm of the matrix $B$, that is, the square root of the sum
of the products of its elements by their complex conjugates, or in the case of a
real matrix the square root of the sum of the squares of the elements.

The condition is that, if

(13) $N(A_1^{-1} A_2) < 1$,

then

(14) $\lim_{m \to \infty} (A_1^{-1} A_2)^m = 0$.

Criterion (13) is readily computed, since $A_1^{-1}$, the reciprocal of a triangular
matrix, is readily computed, and the post-multiplication by $A_2$ involves a number
of zero terms.

A more stringent condition than (13), though still not a necessary condition,
is that if some finite number $p$ can be found such that

(15) $N[(A_1^{-1} A_2)^p] < 1$,

then (14) follows. Since $n$ matrix squarings result in a value of $p = 2^n$, the size
of the norm for fairly large values of $p$ can be investigated without excessive
labor.
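
The norm tests (13) and (15) can be tried on a small case. The Python sketch below forms $B = A_1^{-1} A_2$ by forward substitution and then squares it repeatedly; the 2 by 2 matrix `A_example` and all function names are assumptions chosen so that (13) fails while (15) succeeds at $p = 4$.

```python
import math

def split(A):
    """A1 holds the prediagonal and diagonal terms, A2 the postdiagonal terms."""
    n = len(A)
    A1 = [[A[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    A2 = [[A[i][j] if j > i else 0.0 for j in range(n)] for i in range(n)]
    return A1, A2

def forward_solve(L, b):
    """Solve L y = b for a lower-triangular L with nonzero diagonal."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[j] * y[j] for j in range(i))) / row[i])
    return y

def iteration_matrix(A):
    """B = A1^{-1} A2, built one column at a time."""
    A1, A2 = split(A)
    n = len(A)
    cols = [forward_solve(A1, [A2[i][k] for i in range(n)]) for k in range(n)]
    return [[cols[k][i] for k in range(n)] for i in range(n)]

def matmul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def norm(B):
    """Square root of the sum of squares of the elements (real case)."""
    return math.sqrt(sum(b * b for row in B for b in row))

def norms_by_squaring(A, squarings):
    """Return N(B^p) for p = 1, 2, 4, ..., 2**squarings."""
    B = iteration_matrix(A)
    out = [norm(B)]
    for _ in range(squarings):
        B = matmul(B, B)
        out.append(norm(B))
    return out

# Assumed example: non-singular, nonzero diagonal, large post-diagonal term.
A_example = [[1.0, 2.0],
             [0.25, 1.0]]
norms = norms_by_squaring(A_example, 2)   # N(B), N(B^2), N(B^4)

if __name__ == "__main__":
    print(norms)
```

For this assumed matrix the first two norms exceed 1 and the third falls below it, so the iteration converges even though criterion (13) alone would not show it.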


A REMARK ON INDEPENDENCE OF LINEAR AND QUADRATIC 
FORMS INVOLVING INDEPENDENT GAUSSIAN VARIABLES 

By M. Kac 
Cornell University 

The purpose of this note is to call attention to the following useful theorem, 
which to the best of my knowledge was never stated explicitly. 

If $X_1, X_2, \ldots, X_n$ are identically distributed, independent Gaussian random
variables each having mean 0, then the necessary and sufficient condition that

$\sum_{j,k=1}^{n} a_{jk} X_j X_k$   and   $\sum_{j=1}^{n} \alpha_j X_j = \alpha \cdot X$

be independent, is that

$A\alpha = 0$,

where $A$ is the matrix of the quadratic form, $\alpha$ the vector $(\alpha_1, \alpha_2, \ldots, \alpha_n)$ and $X$ the
vector $(X_1, X_2, \ldots, X_n)$.

Proof of sufficiency. Since $A\alpha = 0$, it follows that 0 is an eigenvalue of $A$,
and $\alpha$ is a corresponding eigenvector.

Denoting by $\lambda_2, \ldots, \lambda_n$ the remaining eigenvalues and by $\beta_2, \ldots, \beta_n$ the
corresponding eigenvectors, we have

$\sum_{j,k=1}^{n} a_{jk} X_j X_k = \sum_{j=2}^{n} \lambda_j (\beta_j \cdot X)^2$.

Since the $\beta$’s are orthogonal to $\alpha$, it follows that the linear combinations $\beta_j \cdot X$
are independent of $\alpha \cdot X$, and this completes the proof.

Proof of necessity. From the assumption of independence it follows that

$\sum_{j,k=1}^{n} a_{jk} X_j X_k$   and   $\left( \sum_{j=1}^{n} \alpha_j X_j \right)^2 = \sum_{j,k=1}^{n} \alpha_j \alpha_k X_j X_k$

are independent. Thus by Craig’s theorem¹

$AB = 0$

where $B = ((\alpha_j \alpha_k))$.

This implies almost immediately that $A\alpha = 0$.
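
The condition $A\alpha = 0$ is easy to check in a familiar case. The Python sketch below takes the linear form to be the sample mean and compares two quadratic forms: the sum of squared deviations from the mean, for which the condition holds, and the raw sum of squares, for which it does not. The sample size $n = 5$ and the helper names are assumptions made for illustration.

```python
def mat_vec(A, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def identity(n):
    return [[1.0 if j == k else 0.0 for k in range(n)] for j in range(n)]

def centering_matrix(n):
    """Matrix A with X'AX = sum_j (X_j - mean)^2, i.e. A = I - (1/n) 11'."""
    return [[(1.0 if j == k else 0.0) - 1.0 / n for k in range(n)] for j in range(n)]

n = 5                          # assumed sample size
alpha = [1.0 / n] * n          # alpha.X is the sample mean
A_dev = centering_matrix(n)    # quadratic form: sum of squared deviations
A_raw = identity(n)            # quadratic form: sum of X_j^2

if __name__ == "__main__":
    print(mat_vec(A_dev, alpha))   # zero vector up to rounding: independent of the mean
    print(mat_vec(A_raw, alpha))   # nonzero: not independent of the mean
```

The first case is the independence of the sample mean and the sum of squared deviations that underlies Student’s distribution.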

* Added in proof. Dr. L. Guttman has kindly pointed out to me that the proof of
sufficiency given here has been used by D. Jackson in the article “Mathematical principles
in the theory of small samples”, Amer. Math. Monthly, Vol. 42 (1935), pp. 344-364, see in
particular pp. 354-355. Jackson considers only the independence of $\bar{x}$ and $s$, which is of
crucial importance in deriving Student’s distribution.

¹ A. T. Craig, Annals of Math. Stat., Vol. 14 (1943), pp. 195-197; see also H. Hotelling,
ibid., Vol. 15 (1944), pp. 427-429.



ABSTRACTS OF PAPERS

Presented on September 16, 1945 at the Rutgers meeting of the Institute

1. On the Variance of a Random Set in n Dimensions. Herbert Robbins,
Lieutenant USNR, Postgraduate School, Annapolis, Md.

Using a general formula for the moments of the measure of a random set $X$ (Ann. Math.
Stat., Vol. XV (1944), pp. 70-74) we find the mean and variance in the case where $X$ is a
random sum of $n$-dimensional intervals with sides parallel to the coordinate axes, thus
generalizing the results previously found (loc. cit.) for the case $n = 1$.

2. The Non-Central Wishart Distribution and its Application to Problems in
Multivariate Statistics. T. W. Anderson, Princeton University.

The non-central Wishart distribution is the joint distribution of sums of squares and
cross-products of deviations of observations from multivariate normal distributions with
identical variance-covariance matrices and with different sets of means. The rank of the
non-central Wishart distribution is defined as the rank of the matrix of sets of means. In a
previous paper (by M. A. Girshick and the present author) the non-central Wishart
distribution is given explicitly for the rank one and two cases and indicated for the case of any
rank. In the present paper the characteristic function of the non-central Wishart
distribution is given for general rank. The distribution, which is given in the form of a multiple
integral, is the product of a central Wishart distribution and a symmetric function of the
roots of a determinantal equation involving the matrix of squares and cross products of
observations and the matrix of population means. It is shown that the convolution of two
non-central Wishart distributions is again a non-central Wishart distribution if the
variance-covariance matrices are the same. The moments of the generalized variance and the
moments of the likelihood ratio criterion for testing certain linear hypotheses (for example,
the hypothesis that the means of a set of populations are identical, given that the matrices
of population variances and covariances are the same) are obtained for the linear and planar
non-central cases in terms of infinite series. Likelihood ratio criteria are developed for
testing the dimensionality of the means of a set of multivariate populations (with identical
variances and covariances) on the basis of one sample from each. The criterion for testing
whether the dimensionality is $h$ in the space of $p$ dimensions is a symmetric function of the $p - h$
smallest roots of the determinantal equation involving the sample estimate of the matrix
of variances and covariances and the sums of squares and cross-products of deviations of
sample means. The maximum likelihood estimates of the hyperplanes and positions of
means on them are obtained. The asymptotic distributions of the criteria are
$\chi^2$-distributions.

3. The Effect on a Distribution Function of Small Changes in the Population
Function. Burton H. Camp, Wesleyan University.

It is generally assumed in the application of distribution theory that, if the actual
population function is not very different from the one used in the theory, then the true sampling
distribution of a statistic will not be very different from the one obtained in the theory.
But elsewhere in mathematics we do not assert that a conclusion will be only slightly
modified by a small deviation in the hypothesis. This paper presents some theorems which are
useful in determining the maximum effect on a sampling distribution of certain kinds of
small changes in the population function.




4. Composite Distributions. Casper Goffman and Benjamin Epstein,
Westinghouse Electric Corporation.

Let $f(x; \theta_1, \theta_2, \ldots, \theta_n)$ be a function such that for every point $\theta_1 = \theta_{10}, \ldots, \theta_n = \theta_{n0}$ in
parameter space, $x$ is a random variable with p.d.f. $f(x; \theta_{10}, \ldots, \theta_{n0})$. Suppose further
that the parameters $\theta_1, \theta_2, \ldots, \theta_n$ are themselves random variables whose p.d.f.’s are
given respectively by $\varphi_1(\theta_1), \ldots, \varphi_n(\theta_n)$. Using a concept of “probability contained in an
interval” and an axiom based on this concept, we show that $x$ is a random variable with
p.d.f. $g(x)$ given by the formula

(1) $g(x) = \int \cdots \int f(x; \theta_1, \ldots, \theta_n)\, \varphi_1(\theta_1) \cdots \varphi_n(\theta_n)\, d\theta_1 \cdots d\theta_n$.

In this paper we consider statistical properties of the function $g(x)$ in cases of particular
interest in applications. The cases treated here are (a) where the mean, $\xi$, is the only
variable parameter, (b) where the standard deviation, $\sigma$, is the only variable parameter, and
(c) where the mean, $\xi$, and the standard deviation, $\sigma$, are both variable parameters, $\xi$ and $\sigma$
being independent.

It is shown that problems (a) and (b) are equivalent respectively to the sum and product
of two independent random variables, one of which has zero mean. Formulae for the
moments in problem (c) are then derived in terms of the formulae obtained for (a) and (b).

5. Population, Expected Values and Sample. E. J. Gumbel, New School for 
Social Research. 

Let $x$ be an unlimited continuous variate, and let $F(x)$ be the probability of a value equal
to, or less than, $x$. Then the expected $m$th values $x_m$, for $n$ observations, are
approximations to the most probable $m$th values and defined by $F(x_m) = F_1 + (F_n - F_1)(m - 1)/(n - 1)$,
where $F_1$ and $F_n$ are the probabilities of the most probable first and the most
probable last value. The probabilities $F_1$, $1 - F_n$ and $(F_n - F_1)/(n - 1)$ are of the order of
magnitude $1/n$.

The distribution of the expected values $x_m$ differs from the distribution of the sample
and from the theoretical distribution. However, for a symmetrical distribution the mean
and the odd moments about mean calculated from the expected values coincide with the
mean and the moments of the population. For the normal distribution, the expected
standard deviation $\sigma(n)$ divided by the standard deviation $\sigma$ of the population and traced
on normal probability paper approximates a linear function of $\sqrt{\log n}$. The approach of
$\sigma(n)$ toward $\sigma$ is slow. For 500 observations, $\sigma(n)$ is about 99% of $\sigma$. The moments of the
distribution of the expected values exist even in the case that the moments of the theoretical
distribution diverge.

6. On Optimum Estimates for Stratified Samples. Morris H. Hansen and
William N. Hurwitz, Bureau of the Census.

A stratified sample is drawn from a population with $h$ strata. Neyman found the
optimum sample allocation for the “best unbiased linear estimate.” However, biased but
consistent estimates of the form $x_i / y_i$ where both $x_i$ and $y_i$ are random variables have been
found to give more reliable results in a large class of problems. Even more efficient
estimates can be obtained by finding the values of $n_i$ (the sample size) and $w_i$ which minimize

21 ^^ I 

the mean square error of estimates of the form Sia, or --- . 

y, Vi 





7. Pearsonian Correlation Coefficients Associated with Least Squares Theory.
Paul S. Dwyer, University of Michigan. (Read by Title).

In least squares theory we have the predicting variable $x$, the observed value of the
predicted variable, $y$, the residual $e$, and the predicted value of the predicted variable $Y$.
The purpose of this paper is to study the Pearsonian coefficients resulting from correlating
all these variables in pairs (a) in the case of a single predicted variable and (b) in the case
of two or more predicted variables. The results yield such coefficients as multiple
correlation, multiple alienation, partial correlation, part correlation, and new coefficients not
previously in use. The results are given in expanded, determinant, and matrix form. A
simplified calculational technique is provided.




